Author What does ASE 12.5.3 ESD#4 CR 369253 really mean?
FlyBean

2005-10-27, 8:21 am

The content of this CR is:

> Improve ASE's network infrastructure to prevent an inability to connect to ASE resulting from a hung engine.


So what does it really mean? Since ASE 12.5.1 I have run into several problems
that point, directly or indirectly, to ASE's network layer.

So could SYBASE tell us more about this CR?

Thanks a lot.

Flybean
David Wein

2005-10-27, 8:21 am

I can provide some info on this CR.

Prior to the implementation of CR 369253 you could get into a situation
where a single hanging engine could prevent logins to the entire server.
ASE's logic for handling incoming connections includes having engine 0
detect that there is a new connection and then having a network listener run
on the engine with the fewest number of active connections. This listener
is responsible for accepting all pending connections at the time it runs,
and engine 0 will not be looking for new connections until this listener has
run. The problem comes in when the engine selected to run the listener
hangs before the listener can run. This hang could be due to a number of
things such as an unresponsive system call or task that is stuck in a loop.
As long as that engine is hung the listener is stuck waiting to run and no
connections can come in.

CR 369253 improves upon this situation. ASE will now detect that the
listener is waiting to run on an engine that is hung and will essentially
reschedule it to run on another engine.
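
To make the before/after behavior concrete, here is a toy Python model of the scheduling decision described above. It is only a sketch of the idea, not ASE's actual code: the `Engine` class, the `pick_listener_engine` function and the `skip_hung` flag are all made-up names, and how ASE really detects a hung engine is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    number: int
    connections: int
    hung: bool = False


def pick_listener_engine(engines, skip_hung):
    """Return the engine the listener would run on, or None if it is stuck.

    skip_hung=False models the behavior before CR 369253,
    skip_hung=True models the behavior after it (hypothetical flag).
    """
    # The listener is sent to the engine with the fewest active connections.
    by_load = sorted(engines, key=lambda e: e.connections)
    target = by_load[0]
    if not target.hung:
        return target
    if not skip_hung:
        # Old behavior: the listener waits on the hung engine, so no
        # new connections are accepted anywhere on the server.
        return None
    # New behavior: reschedule onto the least-loaded engine that is not hung.
    for engine in by_load[1:]:
        if not engine.hung:
            return engine
    return None


if __name__ == "__main__":
    engines = [Engine(0, 10), Engine(1, 2, hung=True), Engine(2, 5)]
    print(pick_listener_engine(engines, skip_hung=False))
    print(pick_listener_engine(engines, skip_hung=True))
```

With engine 1 hung and least loaded, the first call models the old lock-out (no engine is returned) and the second call falls back to engine 2.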

This situation shouldn't be a common occurrence. I saw this happen a small
handful of times over a number of years, and it is more of a side issue than the
true root of a problem (i.e. why did your engine hang in the first place?).
However it does come down to the fault tolerance of ASE so it was definitely
worthwhile to implement.

As a side note, there have been several fixes directly related to the
reliability of network listeners since the 12.5.1 release, so I would
strongly recommend the latest 12.5.3 release.

-Dave

"FlyBean" <flybean@sybasebbs.com> wrote in message
news:434a2064@forums-2-dub...
> This content of this CR is
>
> > Improve ASE's network infrastructure to prevent an inability to connect
> > to ASE resulting from a hung engine.
>
> So what does it really mean? I met some troubles which point to the network of
> ASE directly or in-directly since ASE 12.5.1.
>
> So could SYBASE tell us more about this CR?
>
> Thanks a lot.
>
> Flybean



FlyBean

2005-10-27, 8:21 am

Sorry for sending mail to you directly.

David Wein wrote:
> I can provide some info on this CR.
>
> Prior to the implementation of CR 369253 you could get into a situation
> where a single hanging engine could prevent logins to the entire server.
> ASE's logic for handling incoming connections includes having engine 0
> detect that there is a new connection and then having a network listener run
> on the engine with the fewest number of active connections. This listener
> is responsible for accepting all pending connections at the time it runs,
> and engine 0 will not be looking for new connections until this listener has
> run. The problem comes in when the engine selected to run the listener
> hangs before the listener can run. This hang could be due to a number of
> things such as an unresponsive system call or task that is stuck in a loop.
> As long as that engine is hung the listener is stuck waiting to run and no
> connections can come in.
>
> CR 369253 improves upon this situation. ASE will now detect that the
> listener is waiting to run on an engine that is hung and will essentially
> reschedule it to run on another engine.
>
> This situation shouldn't be a common occurrence. I saw this happen a small
> handful of times over a number of years, and it is more of a side issue than the
> true root of a problem (i.e. why did your engine hang in the first place?).
> However it does come down to the fault tolerance of ASE so it was definitely
> worthwhile to implement.
>
> As a side note, there have been several fixes directly related to the
> reliability of network listeners since the 12.5.1 release, so I would
> strongly recommend the latest 12.5.3 release.
>


Yes, you are right. Some time ago it was said that ASE 12.5.2 ESD#2 would put an
end to the lost-listener problem. In my experience that was not the case. I also
checked the Targeted CR lists for ASE 12.5.3 and its ESD#1, ESD#2 and ESD#3 and
found nothing about it. Finally I saw this CR, so I really want to know whether
it could get me out of this trouble.

Here I describe the trouble once more. (You can also find my earlier posts by
searching the newsgroup.)

First, there were lots of 1608 errors in the log.
Under ASE 12.5, things were still OK.
After the upgrade to ASE 12.5.1, the listener was often lost, so we upgraded to
ASE 12.5.2 ESD#2. After that, ASE often reported timeslice errors with the
following stack trace (along with lots of 1608 errors).

> 00:00000:00000:2004/11/29 17:20:20.60 kernel timeslice -500, current process infected
> 01:00000:00396:2004/11/29 17:20:20.60 kernel SQL Server system exception (0xe0000001) generated by a process exceeding its time slice allotment.
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x77E349D3 kernel32.dll (0xE0000001, 0x00000001, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x77E349D3 kernel32.dll (0x202F5920, 0x24347800, 0x2437FCA8, 0x2437FD06)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00BA8995 AseHTTPURLInputStream::readBytes+ 0x7cd7d (0x18F901FA, 0xFFFFFFFF, 0xFFFFFFFF, 0x2437FCA8)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00BE6BD4 AseHTTPURLInputStream::readBytes+ 0xbafbc (0x18F901FA, 0x0291FF48, 0x013F3D24, 0x004C0A14)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00BBCAFE AseHTTPURLInputStream::readBytes+ 0x90ee6 (0x18F901FA, 0x00000000, 0x00BF1A10, 0x202F5360)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x004078D2 (Symbol not found)(0x23300984, 0x0291FA7C, 0x00BC900F, 0x23300984)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00422F5E (Symbol not found)(0x23300984, 0x00BC89E7, 0x2437FCA8, 0x243BFD60)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00422F70 (Symbol not found)(0x2330F830, 0x23300984, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00BC900F AseHTTPURLInputStream::readBytes+ 0x9d3f7 (0x00000099, 0x00000000, 0x00000000, 0x2437FCA8)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel [Handler pc: 0x0047BC84 (Symbol not found) installed by the following function:-]
> 01:00000:00396:2004/11/29 17:20:20.96 kernel [Handler pc: 0x006F8E10 (Symbol not found) installed by the following function:-]
> 01:00000:00396:2004/11/29 17:20:20.96 kernel [Handler pc: 0x006F8E10 (Symbol not found) installed by the following function:-]
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x004210E1 (Symbol not found)(0x2437FCA8, 0x00000000, 0x00000000, 0x2437FCA8)
> 01:00000:00396:2004/11/29 17:20:20.96 kernel pc: 0x00BA88D7 AseHTTPURLInputStream::readBytes+ 0x7ccbf (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:20.99 kernel pc: 0x77E1A990 kernel32.dll (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:20.99 kernel end of stack trace, spid 396, kpid 0, suid 1
> 01:00000:00396:2004/11/29 17:20:20.99 kernel ************************************
> 01:00000:00396:2004/11/29 17:20:20.99 kernel pc: 0x00C2DBED AseHTTPURLInputStream::readBytes+ 0x101fd5 (0x0291EE14, 0x77E40ABC, 0x77E4FD48, 0xFFFFFFFF)
> 01:00000:00396:2004/11/29 17:20:20.99 kernel pc: 0x00C2DBED AseHTTPURLInputStream::readBytes+ 0x101fd5 (0x0291EE14, 0x0291EBCC, 0x0000270F, 0x00000002)
> 01:00000:00396:2004/11/29 17:20:20.99 kernel pc: 0x00C0463A AseHTTPURLInputStream::readBytes+ 0xd8a22 (0x18F901FA, 0x00000002, 0x0000270F, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel pc: 0x00C042C4 AseHTTPURLInputStream::readBytes+ 0xd86ac (0x18F901FA, 0x00000001, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel pc: 0x00BB384B AseHTTPURLInputStream::readBytes+ 0x87c33 (0x18F901FA, 0xFFFFFFFF, 0x0291F37C, 0xE0000001)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel pc: 0x00406974 (Symbol not found)(0xE0000001, 0x77B94DB1, 0x0291F3B0, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel pc: 0x00BA7D54 AseHTTPURLInputStream::readBytes+ 0x7c13c (0x2437FCA8, 0x00000000, 0x00000000, 0x2437FCA8)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel pc: 0x00BA8957 AseHTTPURLInputStream::readBytes+ 0x7cd3f (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel pc: 0x77E1A990 kernel32.dll (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 01:00000:00396:2004/11/29 17:20:21.01 kernel end of stack trace, spid 396, kpid 0, suid 1
> 00:00000:00000:2004/11/29 17:23:36.73 kernel secleanup: time to live expired on engine 1


While monitoring, I found that the error was usually triggered by one
application, and I noticed that after this error the listener was gone! Then I
tried a test: I kept an isql session logged in to the server, and when the
listener disappeared I started a new one with sp_listener. New connections were
accepted again, but just half an hour later ASE restarted itself.
Then I turned my attention to the 1608 errors. While reading the readme.txt of
the OLEDB provider, I found the following:
> Version 02.70.0010 (12.5.0/P-EBF10719/02.70.0010)
> ------------------------------------------------
>
> CR 294297: Oledb Provider would close the network socket after TDS_LOGOUT, without waiting for the TDS_DONE acknowledgement from the ASE.
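
The ordering problem that CR describes can be sketched in Python. This is only an illustration of "wait for the acknowledgement before closing the socket", not real TDS: the `LOGOUT` and `DONE` byte strings are placeholders, and `fake_server` and `logout_and_wait` are hypothetical names.

```python
import socket
import threading

# Placeholder messages; real TDS tokens are binary and look different.
LOGOUT = b"LOGOUT"
DONE = b"DONE"


def fake_server(conn):
    """Stand-in for the server side: acknowledge a logout request."""
    with conn:
        if conn.recv(len(LOGOUT)) == LOGOUT:
            conn.sendall(DONE)


def logout_and_wait(sock, timeout=5.0):
    """Send the logout, then read the acknowledgement BEFORE closing.

    A client that closes right after sendall() is the behavior the CR
    text describes: the server's reply then hits a dead socket.
    """
    sock.settimeout(timeout)
    sock.sendall(LOGOUT)
    reply = b""
    try:
        while len(reply) < len(DONE):
            chunk = sock.recv(len(DONE) - len(reply))
            if not chunk:  # peer closed without a full acknowledgement
                break
            reply += chunk
    finally:
        sock.close()
    return reply == DONE


if __name__ == "__main__":
    client_sock, server_sock = socket.socketpair()
    worker = threading.Thread(target=fake_server, args=(server_sock,))
    worker.start()
    print("acknowledged:", logout_and_wait(client_sock))
    worker.join()
```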


The "murderer" does use OLEDB. I checked its version, and it was indeed older
than the one this CR refers to. So I asked the application provider to upgrade
the OLEDB driver. After that the box was stable and the 1608 errors were gone!
Occasionally someone installs the application with the old driver version; then
the 1608 errors come back, and the trouble with them.
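
For checking many clients, the version comparison can be scripted. A minimal Python sketch, assuming the driver version is available as a dotted string in the same form as the readme entry (02.70.0010); how to read the installed version from a client machine is not shown, and `needs_upgrade` is a made-up helper name.

```python
def parse_version(text):
    """Turn a dotted version such as '02.70.0010' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(installed, fixed_in="02.70.0010"):
    """True if the installed driver is older than the readme's version.

    The default is the version quoted from the OLEDB readme in this post.
    """
    return parse_version(installed) < parse_version(fixed_in)


if __name__ == "__main__":
    # The first two version strings are invented test inputs.
    for version in ("02.70.0006", "02.50.0012", "02.70.0010"):
        print(version, "needs upgrade:", needs_upgrade(version))
```

Comparing integer tuples rather than raw strings avoids surprises when the zero padding differs between builds.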

In our organization there are several nodes running the same application with
the same architecture. Some of those nodes hit this trouble (with 1608 errors),
and we cured it by upgrading the OLEDB driver. But recently one node hit the
trouble without any 1608 errors. Since I could not go on site, I could not
confirm that all clients had been upgraded correctly. The following is the
errorlog:
> 00:00000:00000:2005/10/10 04:11:38.34 kernel timeslice -500, current process infected
> 02:00000:00022:2005/10/10 04:11:38.36 kernel SQL Server system exception (0xe0000001) generated by a process exceeding its time slice allotment.
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x77E349D3 kernel32.dll (0xE0000001, 0x00000001, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x77E349D3 kernel32.dll (0x200FC020, 0x24481800, 0x244920C0, 0x2449211E)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x00BA8995 AseHTTPURLInputStream::readBytes+ 0x7cd7d (0x31BE0095, 0xFFFFFFFF, 0xFFFFFFFF, 0x244920C0)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x00BE6AB5 AseHTTPURLInputStream::readBytes+ 0xbae9d (0x31BE0095, 0x0288FF48, 0x013F3D24, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x00BBCAFE AseHTTPURLInputStream::readBytes+ 0x90ee6 (0x31BE0095, 0x00000000, 0x245086A0, 0x000000CB)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x004078D2 (Symbol not found)(0x21FB5694, 0x00000000, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel pc: 0x00422F5E (Symbol not found)(0x000000CB, 0x00000000, 0x00000000, 0x244920C0)
> 02:00000:00022:2005/10/10 04:11:38.63 kernel [Handler pc: 0x0047BC84 (Symbol not found) installed by the following function:-]
> 02:00000:00022:2005/10/10 04:11:38.64 kernel [Handler pc: 0x006F8E10 (Symbol not found) installed by the following function:-]
> 02:00000:00022:2005/10/10 04:11:38.64 kernel [Handler pc: 0x006F8E10 (Symbol not found) installed by the following function:-]
> 02:00000:00022:2005/10/10 04:11:38.64 kernel pc: 0x004225EE (Symbol not found)(0x244920C0, 0x00000000, 0x00000000, 0x244920C0)
> 02:00000:00022:2005/10/10 04:11:38.64 kernel pc: 0x00BA88D7 AseHTTPURLInputStream::readBytes+ 0x7ccbf (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x77E1A990 kernel32.dll (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel end of stack trace, spid 22, kpid 0, suid 30
> 02:00000:00022:2005/10/10 04:11:38.66 kernel ************************************
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00C2DBED AseHTTPURLInputStream::readBytes+ 0x101fd5 (0x0288EE58, 0x77E40ABC, 0x77E4FD48, 0xFFFFFFFF)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00C2DBED AseHTTPURLInputStream::readBytes+ 0x101fd5 (0x0288EE58, 0x0288EC10, 0x0000270F, 0x00000002)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00C0463A AseHTTPURLInputStream::readBytes+ 0xd8a22 (0x31BE0095, 0x00000002, 0x0000270F, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00C042C4 AseHTTPURLInputStream::readBytes+ 0xd86ac (0x31BE0095, 0x00000001, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00BB384B AseHTTPURLInputStream::readBytes+ 0x87c33 (0x31BE0095, 0xFFFFFFFF, 0x0288F3C0, 0xE0000001)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00406974 (Symbol not found)(0xE0000001, 0x77B94DB1, 0x0288F3F4, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00BA7D54 AseHTTPURLInputStream::readBytes+ 0x7c13c (0x244920C0, 0x00000000, 0x00000000, 0x244920C0)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x00BA8957 AseHTTPURLInputStream::readBytes+ 0x7cd3f (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel pc: 0x77E1A990 kernel32.dll (0x00000000, 0x00000000, 0x00000000, 0x00000000)
> 02:00000:00022:2005/10/10 04:11:38.66 kernel end of stack trace, spid 22, kpid 0, suid 30
> 00:00000:00000:2005/10/10 04:14:57.15 kernel secleanup:
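
Since both traces have the same shape, the offending spid and suid can be pulled out of an errorlog mechanically. A minimal Python sketch, assuming each line starts with engine:family:spid:timestamp as in the lines quoted in this post (an optional leading "> " from newsgroup quoting is tolerated); `timeslice_events` is a hypothetical helper, not a Sybase tool.

```python
import re

# Prefix as it appears in the quoted errorlog lines:
# engine:family:spid:YYYY/MM/DD HH:MM:SS.ss source message
LINE = re.compile(
    r"^(?:>\s*)?(?P<engine>\d{2}):(?P<family>\d{5}):(?P<spid>\d{5}):"
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{2}) "
    r"(?P<source>\S+)\s+(?P<msg>.*)$"
)
END = re.compile(r"end of stack trace, spid (\d+), kpid (\d+), suid (\d+)")


def timeslice_events(lines):
    """Collect one record per timeslice exception found in the log lines."""
    events = []
    for raw in lines:
        match = LINE.match(raw.strip())
        if match is None:
            continue  # wrapped continuation lines and other noise
        msg = match["msg"]
        if "process exceeding its" in msg:
            events.append({
                "engine": int(match["engine"]),
                "spid": int(match["spid"]),
                "time": match["ts"],
                "suid": None,
            })
            continue
        end = END.search(msg)
        # Attach the suid from the trace footer to the event it belongs to.
        if end and events and events[-1]["spid"] == int(end.group(1)):
            events[-1]["suid"] = int(end.group(3))
    return events


if __name__ == "__main__":
    sample = [
        "> 00:00000:00000:2005/10/10 04:11:38.34 kernel timeslice -500, current process infected",
        "> 02:00000:00022:2005/10/10 04:11:38.36 kernel SQL Server system exception (0xe0000001) generated by a process exceeding its time slice allotment.",
        "> 02:00000:00022:2005/10/10 04:11:38.66 kernel end of stack trace, spid 22, kpid 0, suid 30",
    ]
    for event in timeslice_events(sample):
        print(event)
```

The suid can then be matched against the server's logins to see which account, and so which application, keeps exceeding its time slice.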


Unfortunately, the murderer is the same application. I am really tired of this.
I hope 12.5.3 ESD#4 will finally put an end to it.

Thanks a lot.

Btw, I kept the whole logs of this event. If you like, I can pass them to you,
of course in Chinese. :)

Flybean

