Home > Archive > SQL Anywhere Mobile > May 2005 > Uploading only changed columns instead of entire row









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Uploading only changed columns instead of entire row
Brian Greiman

2005-04-15, 1:24 pm

ASA 9.01.1964

Is there a way to only update columns that have changed from remote logs
instead of contents of entire row?

Here is an example of an issue I am seeing:
1. We have a customer master table with 4 columns (customer, customer_name,
address_1, address_2).
2. User on consolidated database updates address_1.
3. User on remote database updates address_2.
4. User on remote database runs MobiLink to upload.
5. Consolidated users updates is now wiped out from data from remote.

Is there any way to preserve the changes made on consolidated and only
update if column data has changed?

Thanks in Advance for any help / suggestions.

Brian


Brian Greiman

2005-04-15, 1:24 pm

One addition. It would be nice for the upload to use the functionality of
dbremote where it only updates the data that has changed (not the entire
row).

Brian


"Brian Greiman" <bgreiman@bernicks.com> wrote in message
news:425ff1e8@forums
-2-dub...
> ASA 9.01.1964
>
> Is there a way to only update columns that have changed from remote logs
> instead of contents of entire row?
>
> Here is an example of an issue I am seeing:
> 1. We have a customer master table with 4 columns (customer,

customer_name,[color
=darkred]
> address_1, address_2).
> 2. User on consolidated database updates address_1.
> 3. User on remote database updates address_2.
> 4. User on remote database runs MobiLink to upload.
> 5. Consolidated users updates is now wiped out from data from remote.
>
> Is there any way to preserve the changes made on consolidated and only
> update if column data has changed?
>
> Thanks in Advance for any help / suggestions.
>
> Brian
>
>[/color]


Breck Carter

2005-04-15, 1:24 pm

Use the "conflict resolution" techniques described in the
Help.

Breck

> ASA 9.01.1964
>
> Is there a way to only update columns that have changed
> from remote logs instead of contents of entire row?
>
> Here is an example of an issue I am seeing:
> 1. We have a customer master table with 4 columns
> (customer, customer_name, address_1, address_2).
> 2. User on consolidated database updates address_1.
> 3. User on remote database updates address_2.
> 4. User on remote database runs MobiLink to upload.
> 5. Consolidated users updates is now wiped out from data
> from remote.
>
> Is there any way to preserve the changes made on
> consolidated and only update if column data has changed?
>
> Thanks in Advance for any help / suggestions.
>
> Brian
>
>

Greg Fenton

2005-04-15, 8:24 pm

Brian Greiman wrote:
> One addition. It would be nice for the upload to use the functionality of
> dbremote where it only updates the data that has changed (not the entire
> row).


But *both* pieces of data has changed! col_1 at the remote, col_2 at
the consolidated. Which change is "right" depends *entirely* on the
application.

As Breck points out, you want to look in to "conflict resolution" to
apply whatever business logic you care to resolve this situation. The
default behaviour is "last synchronization wins".

greg.fenton
--
Greg Fenton
Consultant, Solution Services, iAnywhere Solutions
--------
Visit the iAnywhere Solutions Developer Community
Whitepapers, TechDocs, Downloads
http://www.ianywhere.com/developer/
Brian Greiman

2005-04-15, 8:24 pm

This seems like a major flaw with MobiLink. Let's say I had a customer
master table with 50 columns in it. An office user makes updates to
ar_balance, city, tax_number, fed_id and several other columns for a given
customer row. Later that day, a remote user changes saleperson column for
this customer on his/her laptop and then does a full synchronization. The
result is that the office user's changes are wiped out by the data from the
remote user (even though remote did not change any of these columns). The
remote user wins on all columns on this row, not for his one updated column.

If I implement conflict resolution, as understand it I need to code
upload_update, upload_fetch, upload_old_row_inser
t, upload_new_row_inser
t
events. The update event would be my default update script. Upload_fetch
would get data from consolidated database to test if a conflict exists,
upload_old_row_inser
t will log data from current consolidated values,
upload_new_row_inser
t will log data from remote user. Finally,
resolve_conflict will update data in consolidated based on script specified
there.

My problem is that this seems like a lot of overhead to setup these scripts
for all tables needing this functionality. Also and more importantly, how
can I tell from my resolve_conflict event, what data to update to be the
most accurate. In my customer master, we have 50+ columns, some can be in
both consolidated and remotes. How can I be able to tell what to keep and
throw away if I don't know what columns have been updated by whom. I don't
want to maintian separate change timestamps for each column.

I don't beleive MobiLink has this capability, but I think a good solution
would be to allow for the upload-stream to read through the log file and
process uploads / deletes / inserts much like dbremote does it. Only the
physical columns that are updated are sent up to consolidated. Is there any
such capability? I know I could translate log files with -is local
parameter and send those up to consolidated but that will involve more
manual steps and not use built-in functionalities of MobiLink.

We are in the process of migrating our DBRemote installation to MobiLink and
have deployed about 6 of 60 clients before I came across this issue.
DBRemote was not doing the job for us and we found MobiLink to be way more
efficient for our needs.

Any suggestions or ideas are greatly appreciated.

Thanks again,
Brian


<Breck Carter> wrote in message news:42600323.67b1.1681692777@sybase.com...[color=darkred]
> Use the "conflict resolution" techniques described in the
> Help.
>
> Breck
>


Greg Fenton

2005-04-15, 8:24 pm

Brian Greiman wrote:
>
> My problem is that this seems like a lot of overhead to setup these scripts
> for all tables needing this functionality. Also and more importantly, how
> can I tell from my resolve_conflict event, what data to update to be the
> most accurate. In my customer master, we have 50+ columns, some can be in
> both consolidated and remotes. How can I be able to tell what to keep and
> throw away if I don't know what columns have been updated by whom. I don't
> want to maintian separate change timestamps for each column.
>


As my other post points out, the problem is that only *you* can decide
what is the "right" updates. No system in the *world* could decide
generically which update is correct. Each situation will be different:

- field operator vs. central operator
- employee vs. manager
- batch process vs. human operator
- first change vs. last change
- *both* changes

In your case you are saying that both col_1 and col_2 are updated.
Fine, the ML server gives you access to all three states (old, new and
cons) and your stored procedure can decide which values it cares to keep
(e.g. new col_1 value in new, and new col_2 value in cons).

Yes, this *can* be a lot of work, but in reality it ends up being less
than 5% of tables in a typical synchronization scenario.

Even if ML only sent the "changed" rows, whose to say that those changed
values are the "right" ones? In some scenarios, one type of user's
changes trump the other's. In other scenarios, you want to calculate a
sum (or a difference) between changes and apply to the current values.
In some cases you need the unchanged values to determine whether to
apply the changed values.

Hope this helps,
greg.fenton
--
Greg Fenton
Consultant, Solution Services, iAnywhere Solutions
--------
Visit the iAnywhere Solutions Developer Community
Whitepapers, TechDocs, Downloads
http://www.ianywhere.com/developer/
Brian Greiman

2005-04-15, 8:24 pm

Wouldn't it be better to have "last synchronization wins" to be for each
column, not the entire row. We could have a user that does not update for a
week, makes one column change. Meanwhile, 5 columns have changed on
consolidated. User updates and because he/she updated last, original
(unupdated) columns win. This does not make sense to me.

Brian


"Greg Fenton" <greg. fenton_NOSPAM_@ianyw
here.com> wrote in message
news:42601120$1@foru
ms-1-dub...
> Brian Greiman wrote:
of[color=darkred]
>
> But *both* pieces of data has changed! col_1 at the remote, col_2 at
> the consolidated. Which change is "right" depends *entirely* on the
> application.
>
> As Breck points out, you want to look in to "conflict resolution" to
> apply whatever business logic you care to resolve this situation. The
> default behaviour is "last synchronization wins".
>
> greg.fenton
> --
> Greg Fenton
> Consultant, Solution Services, iAnywhere Solutions
> --------
> Visit the iAnywhere Solutions Developer Community
> Whitepapers, TechDocs, Downloads
> http://www.ianywhere.com/developer/



Brian Greiman

2005-04-15, 8:24 pm

I like the idea of last update wins, but I need this logic by column, not by
the entire row. Like I said in previous post, we could have a user update
one column in row, synchronization occurs and even though 50 other columns
on row may have changed in consolidated, the entire row on consolidated will
be replaced by current contents of row from remote. Wouldn't it be better
to only update "updated columns" instead of the entire row?

I don't fully understand why we would want to upload the entire row instead
of just column updates. Best case for me would be to allow for sending SQL
statements of changed columns / data and primary keys as is used in
dbremote. What value is there to uploading the entire row (including values
that have not changed)? I understand the need for certain updates to trump
others or need to be calculated (ie. inventory updates), but why does the
entire row need to be sent?

Hope to hear more ideas / suggestions / discussion. Still not fully
understanding why MobiLink is built this way.

Brian



"Greg Fenton" <greg. fenton_NOSPAM_@ianyw
here.com> wrote in message
news:42602e08$1@foru
ms-1-dub...
> Brian Greiman wrote:
scripts[color=darkre
d]
how[color=darkred]
in[color=darkred]
and[color=darkred]
don't[color=darkred]

>
> As my other post points out, the problem is that only *you* can decide
> what is the "right" updates. No system in the *world* could decide
> generically which update is correct. Each situation will be different:
>
> - field operator vs. central operator
> - employee vs. manager
> - batch process vs. human operator
> - first change vs. last change
> - *both* changes
>
> In your case you are saying that both col_1 and col_2 are updated.
> Fine, the ML server gives you access to all three states (old, new and
> cons) and your stored procedure can decide which values it cares to keep
> (e.g. new col_1 value in new, and new col_2 value in cons).
>
> Yes, this *can* be a lot of work, but in reality it ends up being less
> than 5% of tables in a typical synchronization scenario.
>
> Even if ML only sent the "changed" rows, whose to say that those changed
> values are the "right" ones? In some scenarios, one type of user's
> changes trump the other's. In other scenarios, you want to calculate a
> sum (or a difference) between changes and apply to the current values.
> In some cases you need the unchanged values to determine whether to
> apply the changed values.
>
> Hope this helps,
> greg.fenton
> --
> Greg Fenton
> Consultant, Solution Services, iAnywhere Solutions
> --------
> Visit the iAnywhere Solutions Developer Community
> Whitepapers, TechDocs, Downloads
> http://www.ianywhere.com/developer/



Greg Fenton

2005-04-15, 8:24 pm

Brian Greiman wrote:
> Wouldn't it be better to have "last synchronization wins" to be for each
> column, not the entire row. We could have a user that does not update for a
> week, makes one column change. Meanwhile, 5 columns have changed on
> consolidated. User updates and because he/she updated last, original
> (unupdated) columns win. This does not make sense to me.


Though this *could* be the way ML works, it isn't. Column-changed
synchronizations, though doable, make synchronization problems *harder*.
For example, how does one write a synchronization script to handle a
*variable* number of columns?

I suggest you post your thoughts to the forum where features requests
are discussed:

sybase.public.sqlanywhere. product_futures_disc
ussions

greg.fenton
--
Greg Fenton
Consultant, Solution Services, iAnywhere Solutions
--------
Visit the iAnywhere Solutions Developer Community
Whitepapers, TechDocs, Downloads
http://www.ianywhere.com/developer/
Greg Fenton

2005-04-15, 8:24 pm

Brian Greiman wrote:
> Wouldn't it be better
> to only update "updated columns" instead of the entire row?
>


Actually, following proper RDBMS schema design practices, the answer is
likely "no". At the third normal form (see doc reference below), your
schema is almost certainly going to be done in such a way that a change
to a row at one site will want to be entirely synchronized elsewhere.
If data is not directly related to the primary key of the given table,
then it likely belongs in a separate (though possibly related) table.
For example, an address typically should not be in a customer table as
the address is (typically) a separate entity from the customer [though
this is a flame war in the making :-) ]

So if the schema is properly normalized, the number of conflicts will be
minimalized and a change in one node is likely to be *entirely*
propogated to other nodes.

That being said, in reality distributed schemas are not always as
normalized as they could be...

So here is where I am able to say that this problem occurs typically in
(less than) 5% of tables. In order for a conflict to occur, the
following must hold true:

- the row must be a shared row amongst multiple nodes
- the row must be updateable by multiple nodes

More often than not, a table is a "one way" table (e.g. download-only
code tables) or it is modified only at one node (a customer record
'belongs' to one remote and so will only change there).


> I don't fully understand why we would want to upload the entire row instead
> of just column updates.


Honestly, because it is "easier", both for the ML engineers as well as
for those of us who write synchronization scripts. Writing scripts to
handle a variable number of rows would be hard and/or an awful lot of work.

> What value is there to uploading the entire row (including values
> that have not changed)?


If we don't upload them, then you have no way of identifying that a
value in the *consolidated* has not changed. Note that a new value in
the consolidated does not necessarily mean that it was updated
(originally) at the consolidated...it may have been updated by another
remote.

So without the "old" values not only do we have less information in
which to apply business logic, we also are unable to filter out
unmodified rows from the download stream. If we update a row in the
consolidated, we would have to re-download the *entire* row because we
can't tell if the *other* rows have been modified or not. So now we
have the same problem of amount of data being synchronized, we've just
made the *download* bigger (and potentially much bigger as I upload a
row with 2 changed columns but have to download all 50 columns...)

Again, I suggest you post thoughts to the product_futures_disc
ussion group.

greg.fenton
--
Greg Fenton
Consultant, Solution Services, iAnywhere Solutions
--------
Visit the iAnywhere Solutions Developer Community
Whitepapers, TechDocs, Downloads
http://www.ianywhere.com/developer/
Greg Fenton

2005-04-15, 8:24 pm

Greg Fenton wrote:
>
> Actually, following proper RDBMS schema design practices, the answer is
> likely "no". At the third normal form (see doc reference below), your


Oops....and now for that doc reference. This is from the ASA 9.x online
docs:

ASA SQL User's Guide
Designing Your Database
The design process
- Step 3: Normalize the data

greg.fenton
--
Greg Fenton
Consultant, Solution Services, iAnywhere Solutions
--------
Visit the iAnywhere Solutions Developer Community
Whitepapers, TechDocs, Downloads
http://www.ianywhere.com/developer/
Breck Carter [TeamSybase]

2005-04-16, 9:23 am

The extended upload_update script is a newer and more powerful
alternative to the upload_fetch script; it is passed the new values
for the non-primary-keys, the primary-key values, and then the old
values from the non-primary-key values. In that script you can
retrieve the current values of tne non-primary-keys from the database
so you have all three values for each non-primary-key column and you
can handle it however you want.

Alternatively, you can use upload_old_row_inser
t and
upload_new_row_inser
t scripts (which are easy to write, and their
creation can be automated via UNLOAD SELECT STRING ( ... ) FROM
SYSTABLE etcetera TO script-file) plus a resolve_conflict script where
the intellectual content comes in. That script gets the table-name so
you can write one procedure to handle multiple tables if you favor
that style of script. For more information, see my book. :)

If you have an environment where multiple users can update the same
column in the same table, for *many* different tables, it's definitely
out of the ordinary. Speaking from experience, the
last-uploaded-update-wins default works for most applications; they
have no need for conflict resolution at all, and some need it for only
a few tables.

How many of your tables can be updated this way?

Breck



On 15 Apr 2005 13:20:18 -0700, "Brian Greiman" <bgreiman@bernicks.com>
wrote:

>This seems like a major flaw with MobiLink. Let's say I had a customer
>master table with 50 columns in it. An office user makes updates to
>ar_balance, city, tax_number, fed_id and several other columns for a given
>customer row. Later that day, a remote user changes saleperson column for
>this customer on his/her laptop and then does a full synchronization. The
>result is that the office user's changes are wiped out by the data from the
>remote user (even though remote did not change any of these columns). The
>remote user wins on all columns on this row, not for his one updated column.
>
>If I implement conflict resolution, as understand it I need to code
>upload_update, upload_fetch, upload_old_row_inser
t, upload_new_row_inser
t
>events. The update event would be my default update script. Upload_fetch
>would get data from consolidated database to test if a conflict exists,
> upload_old_row_inser
t will log data from current consolidated values,
> upload_new_row_inser
t will log data from remote user. Finally,
>resolve_conflict will update data in consolidated based on script specified
>there.
>
>My problem is that this seems like a lot of overhead to setup these scripts
>for all tables needing this functionality. Also and more importantly, how
>can I tell from my resolve_conflict event, what data to update to be the
>most accurate. In my customer master, we have 50+ columns, some can be in
>both consolidated and remotes. How can I be able to tell what to keep and
>throw away if I don't know what columns have been updated by whom. I don't
>want to maintian separate change timestamps for each column.
>
>I don't beleive MobiLink has this capability, but I think a good solution
>would be to allow for the upload-stream to read through the log file and
>process uploads / deletes / inserts much like dbremote does it. Only the
>physical columns that are updated are sent up to consolidated. Is there any
>such capability? I know I could translate log files with -is local
>parameter and send those up to consolidated but that will involve more
>manual steps and not use built-in functionalities of MobiLink.
>
>We are in the process of migrating our DBRemote installation to MobiLink and
>have deployed about 6 of 60 clients before I came across this issue.
>DBRemote was not doing the job for us and we found MobiLink to be way more
>efficient for our needs.
>
>Any suggestions or ideas are greatly appreciated.
>
>Thanks again,
>Brian
>
>
><Breck Carter> wrote in message news:42600323.67b1.1681692777@sybase.com...
>


--
SQL Anywhere Studio 9 Developer's Guide
Buy the book: http://www.amazon.com/exec/obidos/A...7/risingroad-20
bcarter@risingroad.com
RisingRoad SQL Anywhere and MobiLink Professional Services
www.risingroad.com
Jim

2005-04-20, 11:24 am

I'm no expert on this, but we've had some discussion about this very thing.
One potential issue we've thought of regarding column-based syncing is what
if the change in column x by user1 is at least partially based on
information in column y, which later gets changed by user2 on the
consolidated? So now data exists in column x that user 1 never would have
been put there if column y had not been changed.


"Greg Fenton" <greg. fenton_NOSPAM_@ianyw
here.com> wrote in message
news:42604a03$1@foru
ms-2-dub...
> Brian Greiman wrote:
for a[color=darkred]
>
> Though this *could* be the way ML works, it isn't. Column-changed
> synchronizations, though doable, make synchronization problems *harder*.
> For example, how does one write a synchronization script to handle a
> *variable* number of columns?
>
> I suggest you post your thoughts to the forum where features requests
> are discussed:
>
> sybase.public.sqlanywhere. product_futures_disc
ussions
>
> greg.fenton
> --
> Greg Fenton
> Consultant, Solution Services, iAnywhere Solutions
> --------
> Visit the iAnywhere Solutions Developer Community
> Whitepapers, TechDocs, Downloads
> http://www.ianywhere.com/developer/



David Fishburn

2005-05-06, 1:24 pm

"Brian Greiman" <bgreiman@bernicks.com> wrote in
news:42602202$1@foru
ms-1-dub of sybase.public.sqlanywhere.mobilink:

BG> This seems like a major flaw with MobiLink. Let's say I had a
BG> customer master table with 50 columns in it. An office user makes
BG> updates to ar_balance, city, tax_number, fed_id and several other
BG> columns for a given customer row. Later that day, a remote user
BG> changes saleperson column for this customer on his/her laptop and then
BG> does a full synchronization. The result is that the office user's
BG> changes are wiped out by the data from the remote user (even though
BG> remote did not change any of these columns). The remote user wins on
BG> all columns on this row, not for his one updated column.

Yes we agree, and that is why we offer you *row level* and *column level*
conflict detection and resolution in our next major release.

Using column level resolution will only update the consolidated database
with those columns the remote changed, whereas row level is the behaviour
you are currently seeing.

HTH,
--
David Fishburn
Certified ASA Developer Version 8
iAnywhere Solutions - Sybase
Professional Services
Please only post to the newsgroup
Please ALWAYS include version and MORE importantly BUILD number with
EACH post (dbeng9 -v).

EBFs and Maintenance Releases
http://downloads.sybase.com/swx/sdmain.stm

Developer Community / Whitepapers
http://www.ianywhere.com/developer

CaseXpress - to report bugs
http://casexpress.sybase.com

CodeXchange - Free samples
[url]http://ianywhere.codexchange.sybase.com/servlets/ ProjectDocumentList[
/url]

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com