Home > Archive > dBASE Questions and Answers > August 2005 > Index Expression Too Long - VDB 5.7









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Index Expression Too Long - VDB 5.7
V Harris

2005-08-01, 3:25 am

I use an index on one field of a table to put the table into address order. The index works on most of the different address elements in the field -- but not all.

I have already shortened the commands to their 4 characters and removed all the extraneous spaces. Still the index expression is quite long and the 220 character limit has prevented me from modifying the index to take into account the other elements in t
he address field.

I'm wondering if there is a way to build expressions that if fully typed out would normally exceed the 220 character limit?

For instance, the field name which contains the addriess is nameds ADDR1, which is typed out in the expression 11 times. Would creating and using a short memory variable to substitute for ADDR1 in the expression be a proper way to shorten the existing ex
pression so that other characters could be added to the existing index expression?

What other ways would there be to add more sorting capability to the index expression while staying within the 220 character expression limit?

FYI, here is the existing index line in the program:

index on rtri(iif(val(subs(ad
dr1,at(' ',addr1)))=0,;
LTRI(subs(addr1,at('
',addr1)+2)),;
LTRI(str(val(subs(ad
dr1,at(' ',addr1)))))))+;
iif(at(' ',addr1) < at(' ',addr1),'',subs(add
r1,at(' ',addr1)+1,1));
+str(val(addr1)) to STREET

Any suggestions on how to add sorting for more address elements -- and stay within the 220 character limit -- would be greatly appreciated.

V Harris

P.S. You don't really need to make sense of the index above, but if any of you are interested, here are the elements in the address line I use:

HOUSE NUMBER <space> FRACTION OF HOUSE NUMBER <space> STREET DIRECTION <space><space> STREET NAME <space> STREET SUFFIX <space> POST STREET DIRECTION <flush right> </> ZIP CODE

The double space <space><space> element in the address line above is the marker that the index uses to separate the house number from the street so that it can sort on street first, then on the street direction second, then on house number third.

I use a slash to separate the street name from the zip code because the index doesn't need to account for the zip code, and so should disregard the slash and anything after the slash. The zip code appears in the ADDR1 line when the zip code for the physi
cal address (ADDR1) is different than the zip code for the mailing address (ADDR2). When they are different, I have to put the zip code of the physical address in the ADDR1 field.

If you work the index through, you will see that there are several elements of the street address line that are not accounted for in the index line. I'm trying to figure out how to account for them and still stay within the 220 Character limit.

THX
Mervyn Bick

2005-08-01, 3:25 am

On Mon, 01 Aug 2005 08:41:17 +0200, V Harris <VHarris001@aol.com> wrote:

<snip>
>
> I'm wondering if there is a way to build expressions that if fully typed
> out would normally exceed the 220 character limit?
>
> For instance, the field name which contains the addriess is nameds
> ADDR1, which is typed out in the expression 11 times. Would creating
> and using a short memory variable to substitute for ADDR1 in the
> expression be a proper way to shorten the existing expression so that
> other characters could be added to the existing index expression?
>
> What other ways would there be to add more sorting capability to the
> index expression while staying within the 220 character expression limit?
>

<snip>


You may be able to do what you need by splitting the index expression into
variables and then using the macro operator to recombine them (INDEX ON
&a1.+&a2. ... TO STREET) but I suspect that is also going to hit the
expression limit. Unfortunately I don't have vDB 5.7 loaded to try it.

You could, however, achieve the desired result by adding seven (or however
many you want) fields to the table and filling them with the individual
parts of addr1 that you are interested in. Your index would then be INDEX
on a1+a2... to STREET.

Mervyn

--
The ns_ in my address is an attempt to thwart SPAM.
Roland Wingerter

2005-08-01, 3:25 am

V Harris wrote:
> I use an index on one field of a table to put the table into address
> order. The index works on most of the different address elements in
> the field -- but not all.
>
> FYI, here is the existing index line in the program:
>
> index on rtri(iif(val(subs(ad
dr1,at(' ',addr1)))=0,;
> LTRI(subs(addr1,at('
',addr1)+2)),;
> LTRI(str(val(subs(ad
dr1,at(' ',addr1)))))))+;
> iif(at(' ',addr1) < at(' ',addr1),'',subs(add
r1,at(' ',addr1)+1,1));
> +str(val(addr1)) to STREET

-------
Add one extra field to your table with the sort string, fill in the values
programatically.

BTW, an index expression with RTrim() and LTrim() is not a good idea anyway,
because variable length indexes are prone to corruption.

Roland


V Harris

2005-08-07, 8:23 pm

THX for your suggestions. I decided on the solution of adding a field to the table for each address element, then populate the fields with the individual elements, then sort on the fields. I have a couple of questions..

The first is about indexing.

Most roads have names, but many roads are officially 'numbered' roads. That is, the proper name of the road is the actual number -- not the number spelled out.

Since the index needs to sort on road names, the best way I can see to properly index on road names is to add TWO fields to the table.

First, populate one field (A4) with the str(val(road name)). This puts the string equivalent of the value (if any) into the field flush-right, so that when indexing, the leading blanks serve to index the value properly.

Second, populate the next field (A5) with the string street name (whether numeric or character).

Then when indexing on A4 + A5, both the numeric and character street names are properly sorted.

The same thing is done for the house number -- two fields are used, A1, which is populated with the string of the value of the house number, and A2, which is populated with the string house number. This is done to account for any string characters that m
ight be embedded in the house number.

But is there a shorter workaround that wouldn't require TWO fields for each of the street name and the house number?

Thanks again,

V Harris

Here is the code I've added to the program:

*------------------------------------------------------------------------------
*------------------------------------------------------------------------------
NOTE : On August 7, 2005, five (5) fields were added to the table DIRECTRY.dbf.
NOTE : They include A1, A2, A3, A4, A5. They represent individual address
NOTE : elements taken from the field ADDR1. They are populated by this program,
NOTE : and are then used by an index command
NOTE : to create a STREET index for the addresses in ADDR1.
NOTE : The below lines were also added to this program to populate the fields
NOTE : with the data from ADDR1, then to create the STREET index.
*------------------------------------------------------------------------------

repl all A1 with str(val(addr1)) && gets the value of the house number out of ADDR1
repl all A2 with substr(addr1,1,at(' ',addr1)) && to account for any characters in the house number
repl all A3 with iif(at(' ',addr1) < at(' ',addr1),'',subs(add
r1,at(' ',addr1)+1,1)) && gets street direction
repl all A4 with str(val(subs(addr1,a
t(' ',addr1)))) && gets value of street names
repl all A5 with rtrim(substr(addr1,a
t(' ',addr1)+2,;
iif(at('_',addr1) <> 0, (at('_',addr1) - (at(' ',addr1)+2)),30))) && gets street names
index on A4 + A5 + A3 + A1 + A2 to STREET

NOTE : In the future, try to combine the lines for A1 & A2 into one field, and
NOTE : the lines for A4 & A5 into one field.
*------------------------------------------------------------------------------
*------------------------------------------------------------------------------

Mervyn Bick

2005-08-08, 7:24 am

On Sun, 07 Aug 2005 20:44:57 +0200, V Harris <VHarris001@aol.com> wrote:

......
> The first is about indexing.
>
> Most roads have names, but many roads are officially 'numbered' roads.
> That is, the proper name of the road is the actual number -- not the
> number spelled out.


When you say "numbered roads", what form does this take? And are they
always in the same format? In my neck of the woods we have "Road No 3",
Third Ave, 3rd Ave, and just to complicate things we also use Afrikaans
which would be "Pad Nr 3", Derdelaan (with no space) and 3de Laan (with a
space). And to add a bit more spice, in Afrikaans the street number comes
after the street name. So "123a Second Ave", "123a 2nd Ave", "Tweedelaan
123a" and "2de Laan 123a" are all the same address. To ensure consistency
we would need some strict data entry rules backed up by a big stick! <g>

Before we get too deep into trying to parse the addr1 field into
individual elements, how about giving some actual examples of all the
formats you have for existing records?

For the future you should seriously consider revising the data input
screen so that the individual elements go into separate fields. (Apartment
no, Building name, Street no, Street no ext, Street name, Street
direction....) If you really want it, you can concatenate them before
saving them to addr1 as well as to their respective fields. There is,
however, probably no need for addr1 in the file if you had all the bits in
separate fields as you can build it on the fly any time you need it.

You also need to give some thought to normalising your database. As an
example, you are (or could be) storing the same steet name in record after
record. You should have a table where each street name is stored once
with each record in the main table refering to the Streetname table.


> The same thing is done for the house number -- two fields are used, A1,
> which is populated with the string of the value of the house number, and
> A2, which is populated with the string house number. This is done to
> account for any string characters that might be embedded in the house
> number.
>
> But is there a shorter workaround that wouldn't require TWO fields for
> each of the street name and the house number?
>


Assuming that there is no space between the street no and the street no
ext (and in future you should validate this before you write it away) you
could do something like this which will right justify the string in its
field for a correct "numeric" sort on a character field. Street nos
without an extension will be padded on the right with a space so that 212
and 212a line up.

If you ever need to seek a street no you will need to pad it on the left.

go top
do while .not. EOF()
mcA1 = substr(addr1,1,at(" ",addr1)-1)
if isalpha(right(mcA1,1
))
mcA1=mcA1 && there is a street no ext so leave it alone
else
mcA1=mcA1+' ' && pad on right if there is no street no ext
endif
replace A1 with right(' '+mcA1,7) && right justify in field
skip
enddo

or

replace all A1 with iif(isalpha(right(su
bstr(addr1,1,at('
',addr1)-1),1)),right(' '+substr(addr1,1,at(
'
',addr1)-1),7),right(' '+substr(addr1,1,at(
' ',addr1),7))

Watch line wrap. I've assumed a width of 7 for A1. Adjust as necessary
and make sure there are at least as many spaces to concatenate in right().


Mervyn


--
The ns_ in my address is an attempt to thwart SPAM.
V Harris

2005-08-08, 11:25 am

Mervyn Bick Wrote:

>
> Assuming that there is no space between the street no and the street no
> ext (and in future you should validate this before you write it away) you
> could do something like this which will right justify the string in its
> field for a correct "numeric" sort on a character field. Street nos
> without an extension will be padded on the right with a space so that 212
> and 212a line up.
>
> If you ever need to seek a street no you will need to pad it on the left.
>


I guess there is no way to get just one index that displays both the alpha street names in alpha order and the numeric street names in numeric order, AND then use that same index to seek both alpha and numeric street names. For seeking, either the number
will need to be padded -- as you say -- or the seek string must be evaluated as to whether it is alpha or numeric, and either an alpha or numeric index used to perform the seek. Is that right?

I wonder if it is possible, instead of converting the numeric street names to alpha, to convert the alpha street names to some type of numeric string and index on numeric values?

I know that:
val(3RD ST) = 3
val(THIRD ST) = 0

But is there a way to convert (THIRD ST) to, say, ASCII, and then index?

Here are some possible strings from the field ADDR1. Note that there are always two spaces <space> <space> in front of the street name which the index uses to mark the start of the street name. There is always one space <space> between the house number
and the street direction. Also, there is always an underscore <_> preceeding extra stuff that is irrelevant to the sort order and so is excluded.

43 S 11TH AVE
317 N 5TH ST
317B N 5TH ST _71601
317-1/2 N FREDERICK ST_APT 7
317-3/4B E SECOND AVE _71603
317-1/4-C DONEGAL PKWY NE
317 E AVENUE A_APT 1 _71604

Thanks,

V Harris


Mervyn Bick

2005-08-08, 8:24 pm

On Mon, 08 Aug 2005 17:42:06 +0200, V Harris <VHarris001@aol.com> wrote:

.....
> I guess there is no way to get just one index that displays both the
> alpha street names in alpha order and the numeric street names in
> numeric order, AND then use that same index to seek both alpha and
> numeric street names. For seeking, either the number will need to be
> padded -- as you say -- or the seek string must be evaluated as to
> whether it is alpha or numeric, and either an alpha or numeric index
> used to perform the seek. Is that right?


The padding was refering to seeking a street number as the data is right
justified in the street number field. It is not really relevant at the
moment as the street number is the least significant part of your index
and you can't seek only a street number using it unless you build another
index. You would have to seek using a string consisting of
street+direction+num
ber with street and direction right padded and number
left padded if you wanted to find a specific number in a specific street
with the present index.

>
> I wonder if it is possible, instead of converting the numeric street
> names to alpha, to convert the alpha street names to some type of
> numeric string and index on numeric values?
>
> I know that:
> val(3RD ST) = 3
> val(THIRD ST) = 0
>
> But is there a way to convert (THIRD ST) to, say, ASCII, and then index?


If it is not important to keep 3RD ST exactly like that then you could
change 3RD ST to THIRD ST before you write it to A4 (or wherever). If you
have to keep 3RD ST then you could add a field to hold THIRD ST (and all
other street names which will simply be copied for non-number street
names) for index purposes only. To populate this use code based on
NUM2WORDS in convert.cc in the dUFLP. (It's in LIW55B.ZIP which you can
get from http://www.goldenstag.net/dbase/ ) Alternatively you could
change THIRD ST to 3RD ST in the new field which would give you all the
numerically named streets together at the top of the list. (There isn't a
WORDS2NUM in convert.cc so you would have to write it from scratch.)

> Here are some possible strings from the field ADDR1. Note that there
> are always two spaces <space> <space> in front of the street name which
> the index uses to mark the start of the street name. There is always
> one space <space> between the house number and the street direction.
> Also, there is always an underscore <_> preceeding extra stuff that is
> irrelevant to the sort order and so is excluded.
>
> 43 S 11TH AVE
> 317 N 5TH ST
> 317B N 5TH ST _71601
> 317-1/2 N FREDERICK ST_APT 7
> 317-3/4B E SECOND AVE _71603
> 317-1/4-C DONEGAL PKWY NE
> 317 E AVENUE A_APT 1 _71604


The code I posted before for sorting out the street number was based on
having a single letter as a possible street number extension so it won't
quite cut it where there is a "-" and/or a "/" in the extension. If the
Donegal Pkwy address is the worst case then plain numbers will have to be
right padded with 6 spaces. The simple ISALPHA() test will not be enough
to check for a street number extension (the Frederick St address would
fall down) so it will have to be combined with checks for - and / and the
right padding will have to be calculated in each case.

I don't have the time right now to put some code together but if this is
not enough for you to sort it out yourself I'll be able to look at it
again tomorrow.

Mervyn

--
The ns_ in my address is an attempt to thwart SPAM.
Mervyn Bick

2005-08-09, 11:24 am

On Mon, 08 Aug 2005 17:42:06 +0200, V Harris <VHarris001@aol.com> wrote:


> I wonder if it is possible, instead of converting the numeric street
> names to alpha, to convert the alpha street names to some type of
> numeric string and index on numeric values?
>
> I know that:
> val(3RD ST) = 3
> val(THIRD ST) = 0
>
> But is there a way to convert (THIRD ST) to, say, ASCII, and then index?


Anything is possible, some things just take a bit more work. <g>

The code below should give you a starting point but there is a lot more
typing to do depending on how high your numbered streets go to. It works
on the sample data you posted but bear in mind the definition of test
data. (Test data is the only data a program is guaranteed to work with.)

If you index on a4+a2+a1 the results will be presented :-

1ST AVE
FIRST AVE
1ST ST
FIRST ST
2ND AVE
.....
ALPHA AVE
ALPHA ST
.....
BETA AVE
BETA ST
.....
and so on

If you index on a5+a4+a2+a1 the results will be presented :-

1ST AVE
1ST ST
2ND AVE
.....
FIRST AVE
FIRST ST
SECOND AVE
.....
ALPHA AVE
ALPHA ST
.....



The code is "quick and dirty" which I consider acceptable for a once off
operation. If, however, you are regularly going to recieve data, over
which you have no control, for processing then it needs to be made much
more robust with a whole bunch of error checking.
(What happens if some of the data is in lower case or if the double space
in front of a street name is missing, etc.)

Mervyn

--
The ns_ in my address is an attempt to thwart SPAM.



cDBASE = 'testdb'
use &cDBASE
do while .not. eof()
mcA1 = rtrim(substr(addr1,1
,at(' ',addr1)))
mcA3 = substr(addr1,at(' ',addr1)+2)
if at('_',mcA3) <> 0
mcA3 = substr(mcA3,1,at('_'
,mcA3)-1)
endif
mcA4 = mcA3
word2num(mcA4)
if val(mca3)> 0
mcA5 = 'N'
else
mcA5 = 'W'
endif
if at(' ',mcA1) <> 0
mcA2 = substr(mcA1,at(' ',mcA1))
mcA1 = left(mcA1,at(' ',mcA1)-1)
else
mcA2 = ' '
endif
if at('-',mcA1)<>0 .or. at('/',mcA1)<>0 .or. isalpha(right(mcA1,1
))
mcA1len = len(mcA1)
mcA1minus = at('-',mcA1)
mcA1slash = at('/',mcA1)
if isalpha(right(mcA1,1
))
mcA1alpha = 1
else
mcA1alpha = 0
endif
if mcA1minus <>0 .and. mcA1slash <> 0
mcA1nonnum = min(mcA1minus,mca1sl
ash)
endif
if mcA1minus = 0
mcA1nonnum = mcA1slash
endif
if mcA1slash = 0
mcA1nonnum = mcA1slash
endif
if mcA1nonnum <> 0
mCa1pad = 6 - (mcA1len-mcA1nonnum+1)
else
mcA1pad = 6 - mcA1alpha
endif
mcA1 = right(space(20)+mcA1
+space(mcA1pad),20)
else
mcA1 = right(space(20)+mcA1
+space(6),20)
endif
replace a1 with mcA1, a2 with mcA2, a3 with mcA3, a4 with mcA4, a5 with
mcA5
skip
enddo

index on a5+a4+a2+a1 to &cDBASE
&&index on a4+a2+a1 to &cDBASE

list
procedure word2num (lcA4)


lcA4strn = substr(lcA4,1,at(' ',lcA4)-1)
lcA4st = substr(lcA4,at(' ',lcA4))
if val(lcA4strn) > 0
lcA4 = str(val(lcA4strn),2)
+ lcA4st
endif
do case val(lca4strn) - 0
case lcA4strn = 'FIRST'
lcA4 = ' 1'+ lca4st
case lcA4strn = 'SECOND'
lcA4 = ' 2'+ lca4st
case lcA4strn = 'THIRD'
lcA4 = ' 3'+ lca4st
case lcA4strn = 'FOURTH'
lcA4 = ' 4'+ lca4st
case lcA4strn = 'FIFTH'
lcA4 = ' 5'+ lca4st

&& and so on

endcase
return (lcA4)
Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com