[Vortex] weird behavior or misunderstanding of beep/vortex?

Gustavo Sverzut Barbieri Gustavo.Barbieri at indt.org.br
Fri Jun 9 21:19:51 CEST 2006


On Friday 09 June 2006 15:36, Gustavo Sverzut Barbieri wrote:
> On Friday 09 June 2006 11:51, Gustavo Sverzut Barbieri wrote:
> > On Friday 09 June 2006 11:31, you wrote:
> > > On Fri, 09-06-2006 at 09:26 -0300, Gustavo Sverzut Barbieri wrote:
> > >
> > > Hi Gustavo!
> > >
> > > > Now I face a double-free, my backtrace follows. It seems that we have a
> > > > race condition in the test:
> > > >
> > > >         if ((data != NULL) && (data->message != NULL))
> > > >                 g_free (data->message);
> > > >
> > > >         if (data != NULL)
> > > >                 g_free (data);
> > > >
> > > >
> > > > As you can see, there are two problems here: the thread can be
> > > > suspended after the "data != NULL" check and data can be freed in the
> > > > meantime, which causes a failure, or it can be suspended after the
> > > > "data->message != NULL" check, which fails the same way. In my case I
> > > > hit the problem at data->message.
> > > >
> > > > My solution to this kind of problem is to use an atomic instruction to
> > > > swap the things I'll free, like:
> > > >
> > > >         message = xchg( &data->message, NULL );
> > > >
> > > > this will make data->message NULL and return the previous contents,
> > > > which I can then check and free.
> > > >
> > > > But this is just my guess... I need to investigate further... which is
> > > > quite difficult, since using Valgrind makes the problem go away :-D
> > >
> > > The race condition description and its possible solution look fine.
> > > However, there is only one thread running the vortex sequencer loop at
> > > any given time, and it is the only entity that runs the
> > > __vortex_sequencer_unref_and_clear function.
> > >
> > > This makes it difficult to produce a race condition inside that function,
> > > even if the thread is suspended between the pointer check and the
> > > pointer deallocation.
> > >
> > > Did the problem disappear once you applied the patch you propose? Did you
> > > modify the source code to start more than one vortex sequencer?
> > >
> > > If the problem persists maybe you can describe the steps to reproduce
> > > the problem.
> >
> > I'll try to isolate it.
> >
> > From my investigation so far, the problem is that it free()s data->message
> > but does not set it to NULL... it's the same thread, so it's not a race
> > condition. I'm now doing memset( data, 0, sizeof(*data) ), but it starts to
> > break elsewhere.
> >
> > My debug logs from __vortex_sequencer_run are:
> >
> > ### 0) begin: data=0xb3804c48 [pid=-1231029328]
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4118
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4118
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb38050e8 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > connection_send_msg: 0x80c8be8 (127.0.0.1:59845), channel 0x80cb960 (#5) 0x80c1f60
> >
> > connection_send_msg: lock mutex 0x80c7438
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4122
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4122
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4122
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4122
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=4123
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 4123
> > 3) the end: data=(nil)
> > ### 0) begin: data=0xb3804928 [pid=-1231029328]
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=0x80ca6c8, the_size=880
> > ### 3.1) channel#5 packet=0x80ca6b0
> > ### .... packet: type=1, no.: 0, len.: 880
> > 4) step3, data=0xb3804c48
> > ### ... start_resequence: 0xb3804c48
> > ### 3) step 2:
> > ### ... a_frame=(nil), the_size=0
> > ### 3.2) unref data=0xb3804c48
> > *** glibc detected *** double free or corruption (fasttop): 0xb3804c48 ***
> >
> > Program received signal SIGABRT, Aborted.
> > [Switching to Thread -1231029328 (LWP 14296)]
> > 0xffffe410 in __kernel_vsyscall ()
> >
> >
> >
> > This is exactly the last frame/packet of a large message I'm sending.
>
> Ok, I couldn't fix the problem yet, but it looks like the problem is that
> both "channel->pending_messages" and "vortex_sequencer_queue" hold
> references to the same "data", which is freed at "4) step3, data=...",
> right before the "goto start_resequence".
>
> I'm still looking at how you free these; maybe it's free()d twice, maybe in
> the wrong order, I don't know :-(
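
For reference, the swap-then-free idea from earlier in the thread could be
written with C11 atomics roughly as below. This is only a sketch under
assumptions: seq_data and release_message are hypothetical names, the struct
is a stand-in and not the real Vortex type, and free() replaces g_free().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the sequencer data; not the real Vortex struct. */
typedef struct {
    _Atomic(char *) message;
} seq_data;

/* Atomically detach the message pointer, then free whatever was there.
 * Returns 1 if this call freed the message, 0 if it was already NULL. */
static int release_message(seq_data *data)
{
    /* After this line data->message is NULL and old holds the previous value. */
    char *old = atomic_exchange(&data->message, (char *)NULL);

    if (old == NULL)
        return 0;

    free(old);
    return 1;
}
```

Calling release_message() twice on the same data frees the message only the
first time. Note that this only protects the message pointer; it would not
help when two containers hold the same data pointer, which is what seems to
happen here.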

Ok... narrowing the scope, it looks like this is the culprit that queues it twice:

		if (next_seq_no >= max_seq_no) {
->			vortex_channel_queue_pending_message (channel, data);

			vortex_connection_unref (connection, "(vortex sequencer)");
			continue;
		}

It was already added by:

		if (! resequence) {
				vortex_channel_queue_pending_message (channel, data);
		}

Now I'm trying to figure out what to do :-(
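
One direction that might work is to make queuing idempotent, so that the
second call becomes a no-op. What follows is only a stand-alone model of that
idea, not Vortex code: seq_data, pending_queue, is_pending,
queue_pending_once and drain_and_free are all hypothetical names made up for
illustration, and plain malloc/free stand in for the glib calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the sequencer data; not the real Vortex struct. */
typedef struct {
    char *message;
    bool  is_pending;   /* assumed flag: set once the item is in the queue */
} seq_data;

#define MAX_PENDING 8

/* Hypothetical fixed-size model of a channel's pending message queue. */
typedef struct {
    seq_data *items[MAX_PENDING];
    size_t    count;
} pending_queue;

/* Queue data only if it is not queued already.
 * Returns true when the item was added, false otherwise. */
static bool queue_pending_once(pending_queue *q, seq_data *data)
{
    if (data == NULL || data->is_pending || q->count == MAX_PENDING)
        return false;

    data->is_pending = true;
    q->items[q->count++] = data;
    return true;
}

/* Free every queued item and return how many were released.
 * This is safe only because each pointer appears in the queue once. */
static size_t drain_and_free(pending_queue *q)
{
    size_t freed = 0;

    for (size_t i = 0; i < q->count; i++) {
        free(q->items[i]->message);
        free(q->items[i]);
        q->items[i] = NULL;
        freed++;
    }
    q->count = 0;
    return freed;
}
```

With a guard like this, both the "! resequence" branch and the
"next_seq_no >= max_seq_no" branch could call the queue function and data
would still be stored, and later freed, only once. Whether such a flag fits
the real data structure would need checking against the actual sources.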

-- 
Gustavo Sverzut Barbieri
------------------------
INdT, Recife, Brazil

Jabber: barbieri at gmail.com
   MSN: barbieri at gmail.com
  ICQ#: 17249123
 Skype: gsbarbieri
Mobile: +55 (81) 9927 0010
 Phone:  +1 (347) 624 6296; 08122692 at sip.stanaphone.com
   GPG: 0xB640E1A2 @ wwwkeys.pgp.net


