<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Consolas","serif";}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Hello,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Thanks for the quick turnaround to provide the fix.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">However, I have a concern about the code changes.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">As you can see from my previous post, when I made local changes to the code<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">to see how many bytes were actually captured when the failure occurs, the logs show<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">that instead of the next 2 bytes, only one byte is captured from the wire. From your code changes,
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I understand that only the case where zero bytes are captured (EWOULDBLOCK) is accounted for.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I may be mistaken, but I think we need to handle the case more generically.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I see from the code that for the first 2 bytes, and later for handling the 4 bytes of mask data,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">you have already implemented correct buffering into 'pending_buf' for the next attempt.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I think we will need similar handling for this case (as well as for<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">the next 8 bytes of the extended header in the payload_size == 127 case).<o:p></o:p></span></p>
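<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">To illustrate the kind of generic handling I mean, here is a minimal sketch. It is not noPoll's actual code, and all names in it (hdr_state, hdr_needed, hdr_feed) are hypothetical. It assumes the RFC 6455 header layout: 2 base bytes, then 2 or 8 extended length bytes when the 7-bit length is 126 or 127, then 4 mask bytes when the mask bit is set. Whatever bytes arrive are appended to a pending buffer, and the header is reported complete only once all of them are present.<o:p></o:p></span></p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: this is a sketch, not noPoll's API. */
enum { HDR_MAX = 14 };            /* 2 base + 8 extended length + 4 mask */

typedef struct {
    unsigned char buf[HDR_MAX];   /* header bytes collected so far */
    size_t        have;           /* number of valid bytes in buf */
} hdr_state;

/* Total header length implied by the bytes seen so far.
 * With fewer than 2 bytes we only know that at least 2 are needed. */
size_t hdr_needed(const hdr_state *s)
{
    size_t   need = 2;
    unsigned len7;

    if (s->have < 2)
        return need;

    len7 = s->buf[1] & 0x7F;      /* 7-bit payload length field */
    if (len7 == 126)
        need += 2;                /* 16-bit extended length */
    else if (len7 == 127)
        need += 8;                /* 64-bit extended length */
    if (s->buf[1] & 0x80)
        need += 4;                /* masking key present */
    return need;
}

/* Append up to len bytes from data to the pending header.
 * Returns how many bytes were consumed; *complete is set to 1 once the
 * whole header is buffered and to 0 while more bytes are still needed. */
size_t hdr_feed(hdr_state *s, const unsigned char *data, size_t len,
                int *complete)
{
    size_t used = 0;

    assert(s != NULL && complete != NULL);

    /* hdr_needed() is re-evaluated on every pass, so the target grows
     * from 2 to its final value as soon as the second byte is stored. */
    while (used < len && s->have < hdr_needed(s))
        s->buf[s->have++] = data[used++];

    *complete = (s->have >= 2 && s->have == hdr_needed(s));
    return used;
}
```

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">With this shape, a read that returns 0, 1 or any other partial count is handled the same way: the caller keeps the state and retries on the next readable event instead of shutting the connection down.<o:p></o:p></span></p>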
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I am working on integrating your current code changes into our app to see<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">what would happen when the situation hits. It takes quite a while to reproduce<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">the issue. I will let you know what I observe.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">In our application, I believe that the server side always sends the full header in one shot.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">However, the very nature of TCP is such that if a sender sends X bytes in one write/send call,
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">the receiver is not guaranteed to always get all X bytes in one read/recv call...
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">even if X is very small. The underlying OS networking system is free to break up the
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">streaming data at arbitrary boundaries. That is why the connection breaks only occasionally<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">and the issue takes some time to reproduce.<o:p></o:p></span></p>
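<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">The short-read behaviour can be shown with a small self-contained sketch. It assumes a POSIX system and uses a local AF_UNIX stream socket pair with plain recv(), whereas our real connection goes through TLS. The names read_exact_nb, make_pair and rd_status are hypothetical and not part of noPoll. The reader asks for 2 bytes, gets whatever is available, and reports "try again" rather than an error while the count is short.<o:p></o:p></span></p>

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of one non-blocking attempt to complete a fixed-size read. */
typedef enum { RD_DONE, RD_AGAIN, RD_CLOSED, RD_ERROR } rd_status;

/* Try to bring *have up to want bytes in buf without blocking.
 * *have persists across calls, so a partial read is never lost. */
rd_status read_exact_nb(int fd, unsigned char *buf, size_t want, size_t *have)
{
    assert(buf != NULL && have != NULL);

    while (*have < want) {
        ssize_t n = recv(fd, buf + *have, want - *have, 0);
        if (n > 0) {
            *have += (size_t)n;          /* possibly fewer than requested */
            continue;
        }
        if (n == 0)
            return RD_CLOSED;            /* peer closed the stream */
        if (errno == EINTR)
            continue;                    /* interrupted, retry at once */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return RD_AGAIN;             /* nothing more for now */
        return RD_ERROR;
    }
    return RD_DONE;
}

/* Demonstration helper: connected stream pair, fds[0] non-blocking. */
int make_pair(int fds[2])
{
    int flags;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    flags = fcntl(fds[0], F_GETFL, 0);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    return 0;
}
```

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">If the writer sends one byte, the first call leaves have at 1 and returns RD_AGAIN. Once the second byte is written, the next call returns RD_DONE with both bytes in the buffer.<o:p></o:p></span></p>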
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Rahul<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Francis Brosnan Blázquez [mailto:francis.brosnan@aspl.es]
<br>
<b>Sent:</b> Thursday, April 21, 2016 1:52 AM<br>
<b>To:</b> Kale, Rahul<br>
<b>Cc:</b> nopoll@lists.aspl.es<br>
<b>Subject:</b> Fixed -- Re: [noPoll] Websocket disconnect issues<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hello Rahul,<br>
<br>
I've updated noPoll sources to reproduce the problem you described (test_30)<br>
and added needed updates to make it work (svn revision 262):<br>
<br>
<a href="https://github.com/ASPLes/nopoll">https://github.com/ASPLes/nopoll</a><br>
<br>
Check them to see if they fix your issue.<br>
Best Regards,<br>
<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal">Hello Rahul,<br>
<br>
Thanks for reporting. I'm reviewing the issue and will let you know what I find.<br>
<br>
Best Regards,<br>
<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal"> <br>
<br>
Hello,<br>
<br>
<br>
<br>
I have some more debug information regarding this issue.<br>
<br>
<br>
<br>
To confirm the conjecture that the next two bytes may not always be immediately available,<br>
<br>
I made a local modification as shown below and ran the test again, waiting for the next failure.
<br>
<br>
Instead of immediately aborting the connection when this situation occurs, I poll the connection<br>
<br>
for a short duration to see if the next two bytes appear subsequently, and then abort the connection anyway.<br>
<br>
<br>
<br>
} else if (msg->payload_size == 126) {<br>
<br>
/* get extended 2 bytes length as unsigned 16 bit<br>
<br>
unsigned integer */<br>
<br>
bytes = __nopoll_conn_receive (conn, buffer + 2, 2);<br>
<br>
if (bytes != 2) {<br>
<br>
int i;<br>
<br>
nopoll_log (conn->ctx, NOPOLL_LEVEL_CRITICAL, "Failed to get next 2 bytes to read header from the wire (read %d, %d), failed to received content, shutting down id=%d the connection", bytes, errno, conn->id);<br>
<br>
for (i = 0; i < 100; i++) {<br>
<br>
bytes = __nopoll_conn_receive (conn, buffer + 2, 2);<br>
<br>
if (bytes == 2) {<br>
<br>
nopoll_log (conn->ctx, NOPOLL_LEVEL_CRITICAL, "Finally got next 2 bytes to read header from the wire after %d tries, shutting down anyway id=%d the connection", i, conn->id);<br>
<br>
break;<br>
<br>
}<br>
<br>
nopoll_sleep(10); // 10 ms sleep<br>
<br>
}<br>
<br>
nopoll_msg_unref (msg);<br>
<br>
nopoll_conn_shutdown (conn);<br>
<br>
return NULL; <br>
<br>
} /* end if */<br>
<br>
<br>
<br>
The above is just a local change to confirm the root cause.<br>
<br>
<br>
<br>
I reproduced the disconnection bug again and from the logs, you can see that only one byte was available initially<br>
<br>
and after a very short duration we do get the remaining bytes.<br>
<br>
<br>
<br>
The relevant log lines look like:<br>
<br>
<br>
<br>
nopoll_conn.c:2798 Failed to get next 2 bytes to read header from the wire (read 1, 0), failed to received content, shutting down id=2 the connection
<br>
<br>
nopoll_conn.c:2802 Finally got next 2 bytes to read header from the wire after 5 tries, shutting down anyway id=2 the connection
<br>
<br>
<br>
<br>
I think that not getting the next 2 (or 4) bytes should not be treated as an error, but rather as<br>
<br>
the full message not yet being available, just like when waiting for the first two bytes of the message. The call is supposed to be<br>
<br>
non-blocking, so of course we cannot poll as in the test code snippet above. <br>
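<br>
For reference, the "next 2 bytes" here are the 16-bit extended length that RFC 6455 places after a 7-bit length of 126 (8 bytes after 127). The sketch below shows one way a length decoder could report "not yet available" separately from a real error. The function name ws_payload_len and its return codes are hypothetical and not part of noPoll.<br>
<br>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not noPoll's API.
 * Returns  1 and stores the payload length if buf holds enough bytes,
 *          0 if more header bytes are needed (not an error),
 *         -1 if the encoded length is malformed. */
int ws_payload_len(const unsigned char *buf, size_t have, uint64_t *out)
{
    unsigned len7;
    size_t   ext, i;
    uint64_t v = 0;

    if (have < 2)
        return 0;                        /* base header still incomplete */

    len7 = buf[1] & 0x7F;
    if (len7 < 126) {
        *out = len7;                     /* length fits in the 7-bit field */
        return 1;
    }

    ext = (len7 == 126) ? 2 : 8;         /* extended length size in bytes */
    if (have < 2 + ext)
        return 0;                        /* e.g. only 1 of 2 bytes arrived */

    for (i = 0; i < ext; i++)
        v = (v << 8) | buf[2 + i];       /* network byte order */

    if (ext == 8 && (v >> 63))
        return -1;                       /* RFC 6455: top bit must be 0 */

    *out = v;
    return 1;
}
```

<br>
With the bytes 0x82 0x7E 0x01 buffered, this returns 0 (wait for more); once 0xF4 arrives it returns 1 with a length of 500.<br>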
<br>
<br>
<br>
Do let me know if you need any further info.<br>
<br>
This issue is very critical to move forward with our current application.<br>
<br>
<br>
<br>
Regards,<br>
<br>
<br>
<br>
Rahul<br>
<br>
<br>
<br>
<br>
<br>
<b>From:</b> <a href="mailto:nopoll-bounces@lists.aspl.es">nopoll-bounces@lists.aspl.es</a> [<a href="mailto:nopoll-bounces@lists.aspl.es">mailto:nopoll-bounces@lists.aspl.es</a>]
<b>On Behalf Of </b>Kale, Rahul<br>
<b>Sent:</b> Tuesday, April 19, 2016 6:33 PM<br>
<b>To:</b> <a href="mailto:nopoll@lists.aspl.es">nopoll@lists.aspl.es</a><br>
<b>Subject:</b> [noPoll] Websocket disconnect issues<br>
<br>
<br>
<br>
<br>
<br>
<br>
Hello,<br>
<br>
<br>
<br>
I have been trying to track down a random websocket disconnection issue that we are facing.<br>
<br>
In our application, after some random number of hours of a good websocket connection that is
<br>
<br>
transferring data back and forth, the connection suddenly breaks. We are using noPoll only<br>
<br>
for the client side of a WebSocket connection. The server side is an Apache frontend with a
<br>
<br>
Node app backend.<br>
<br>
<br>
<br>
After eliminating all other probable causes, I enabled noPoll debug logs and was finally able
<br>
<br>
to catch the error which causes a disconnect: <br>
<br>
<br>
<br>
nopoll_conn.c:2797 Failed to get next 2 bytes to read header from the wire, failed to received content, shutting down id=2 the connection
<br>
<br>
<br>
<br>
Reproducing this is difficult since it takes anything from a couple of hours to a couple of days<br>
<br>
but eventually it happens. My debugging efforts at narrowing this down are thus taking quite a while.<br>
<br>
<br>
<br>
We are using secure web sockets (wss://). <br>
<br>
<br>
<br>
Looking at the code, it seems that in the noPoll library, we expect the next two bytes of
<br>
<br>
the websocket header to be always available if we read the first two bytes. <br>
<br>
I think that over TCP (and possibly even more so over TLS)<br>
<br>
this may not always be guaranteed. The socket is non-blocking, and it may occasionally<br>
<br>
happen that the 4 bytes are split over two (or more) TCP segments.<br>
<br>
<br>
<br>
When reading the first two bytes, the noPoll library seems to buffer if data is not available.<br>
<br>
For correctness, should this be done for subsequent parts of the header too?<br>
<br>
Of course, I am not an expert here, so could you analyze this and let me know what could<br>
<br>
be the root cause? What should be my next steps?<br>
<br>
<br>
<br>
I have already verified that this is not due to an HTTP(S) server error. I did a TCP dump<br>
<br>
and confirmed that when this happens, the first FIN is sent by the client (noPoll) to the http server.<br>
<br>
<br>
<br>
<br>
<br>
Regards,<br>
<br>
<br>
<br>
Rahul<br>
<br>
<br>
<br>
Rahul Kale<br>
<br>
<br>
<br>
IP Video Systems<br>
<br>
Barco, Inc<br>
<br>
1287 Anvilwood Ave<br>
<br>
Sunnyvale, CA 94089<br>
<br>
<br>
<br>
Tel +1 408 400 4238<br>
<br>
<br>
<br>
This message is subject to the following terms and conditions: <a href="http://www.barco.com/en/maildisclaimer">
MAIL DISCLAIMER</a> <br>
<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" width="100%" style="width:100.0%">
<tbody>
<tr>
<td style="padding:0in 0in 0in 0in">
<pre><o:p> </o:p></pre>
<pre>_______________________________________________<o:p></o:p></pre>
<pre>noPoll mailing list<o:p></o:p></pre>
<pre><a href="mailto:noPoll@lists.aspl.es">noPoll@lists.aspl.es</a><o:p></o:p></pre>
<pre><a href="http://lists.aspl.es/cgi-bin/mailman/listinfo/nopoll">http://lists.aspl.es/cgi-bin/mailman/listinfo/nopoll</a><o:p></o:p></pre>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal"><span style="display:none"><o:p> </o:p></span></p>
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" width="100%" style="width:100.0%">
<tbody>
<tr>
<td style="padding:0in 0in 0in 0in">
<pre><o:p> </o:p></pre>
<pre>-- <o:p></o:p></pre>
<pre>Francis Brosnan Blázquez - ASPL<o:p></o:p></pre>
<pre><a href="http://www.asplhosting.com/">http://www.asplhosting.com/</a><o:p></o:p></pre>
<pre><a href="http://www.aspl.es/">http://www.aspl.es/</a><o:p></o:p></pre>
<pre><a href="https://twitter.com/aspl_es">https://twitter.com/aspl_es</a><o:p></o:p></pre>
<pre><a href="https://twitter.com/asplhosting">https://twitter.com/asplhosting</a><o:p></o:p></pre>
<pre><a href="https://twitter.com/francisbrosnanb">https://twitter.com/francisbrosnanb</a><o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>91 134 14 22 - 91 134 14 45 - 91 116 07 57<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</body>
</html>