<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Hello,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">I have some more debug information regarding this issue.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">To confirm the conjecture that the next two bytes may not always be immediately available,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">I made a local modification as shown below and ran the test again, waiting for the next failure.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Instead of immediately aborting the connection when this situation occurs, I poll the connection<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">for a short duration to see if the next two bytes appear subsequently, and then abort the connection anyway.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
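<p class="MsoNormal"><span style="color:#1F497D">} else if (msg-&gt;payload_size == 126) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;/* get the extended 2-byte length as a 16-bit unsigned integer */<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;bytes = __nopoll_conn_receive (conn, buffer + 2, 2);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;if (bytes != 2) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;int i;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nopoll_log (conn-&gt;ctx, NOPOLL_LEVEL_CRITICAL, "Failed to get next 2 bytes to read header from the wire (read %d, %d), failed to received content, shutting down id=%d the connection", bytes, errno, conn-&gt;id);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;/* test only: retry up to 100 times to see whether the bytes arrive later */<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for (i = 0; i &lt; 100; i++) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bytes = __nopoll_conn_receive (conn, buffer + 2, 2);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (bytes == 2) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nopoll_log (conn-&gt;ctx, NOPOLL_LEVEL_CRITICAL, "Finally got next 2 bytes to read header from the wire after %d tries, shutting down anyway id=%d the connection", i, conn-&gt;id);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;break;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nopoll_sleep(10); /* short pause; nopoll_sleep() takes microseconds, so this is about 10 us rather than 10 ms */<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nopoll_msg_unref (msg);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nopoll_conn_shutdown (conn);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return NULL;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">&nbsp;&nbsp;&nbsp;&nbsp;} /* end if */<o:p></o:p></span></p>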
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The above is just a local change to confirm the root cause.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">I reproduced the disconnection bug again, and the logs show that only one byte was available initially<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">and that the remaining bytes do arrive after a very short delay.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The relevant log lines look like:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">nopoll_conn.c:2798 Failed to get next 2 bytes to read header from the wire (read 1, 0), failed to received content, shutting down id=2 the connection
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">nopoll_conn.c:2802 Finally got next 2 bytes to read header from the wire after 5 tries, shutting down anyway id=2 the connection
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">I think that not getting the next 2 (or 4) bytes should not be treated as an error, but rather as<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">the full message not being available yet, just as when waiting for the first two bytes of the message. The call is supposed to be<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">non-blocking, so of course we cannot poll the way the test snippet above does.
<o:p></o:p></span></p>
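<p class="MsoNormal"><span style="color:#1F497D">To make the suggestion concrete, here is a minimal sketch of the idea, not a patch against noPoll: the header bytes received so far are kept in per-connection state, and an incomplete header is reported as "need more" instead of as a failure. The names hdr_state, hdr_feed and hdr_payload_len are hypothetical and do not exist in noPoll; the header layout is the one from RFC 6455.<o:p></o:p></span></p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state: header bytes received so far. */
typedef struct {
    uint8_t buf[14];  /* largest header: 2 + 8 (length) + 4 (mask) */
    size_t  have;     /* number of valid bytes in buf */
} hdr_state;

typedef enum { HDR_NEED_MORE, HDR_COMPLETE } hdr_status;

/* Total header size implied by the bytes seen so far (RFC 6455 layout). */
static size_t hdr_needed(const hdr_state *st)
{
    size_t need = 2;
    if (st->have < 2)
        return need;
    if ((st->buf[1] & 0x7F) == 126)
        need += 2;              /* 16-bit extended length */
    else if ((st->buf[1] & 0x7F) == 127)
        need += 8;              /* 64-bit extended length */
    if (st->buf[1] & 0x80)
        need += 4;              /* masking key */
    return need;
}

/* Consume bytes from data until the header is complete or data runs out.
 * *used is set to the number of bytes taken from data. A partial header
 * is not an error: the caller simply calls again when more data arrives. */
hdr_status hdr_feed(hdr_state *st, const uint8_t *data, size_t len, size_t *used)
{
    assert(st != NULL && used != NULL);
    *used = 0;
    for (;;) {
        if (st->have >= 2 && st->have == hdr_needed(st))
            return HDR_COMPLETE;
        if (*used == len)
            return HDR_NEED_MORE;
        st->buf[st->have++] = data[(*used)++];
    }
}

/* Decode the payload length from a complete header (network byte order). */
uint64_t hdr_payload_len(const hdr_state *st)
{
    uint8_t  len7 = st->buf[1] & 0x7F;
    size_t   n = (len7 == 126) ? 2 : (len7 == 127) ? 8 : 0;
    uint64_t v = 0;
    size_t   i;

    if (n == 0)
        return len7;
    for (i = 0; i < n; i++)
        v = (v << 8) | st->buf[2 + i];
    return v;
}
```

<p class="MsoNormal"><span style="color:#1F497D">With something along these lines, the read path could return "no message yet" whenever hdr_feed reports HDR_NEED_MORE and resume from the saved state on the next call, which keeps the call non-blocking.<o:p></o:p></span></p>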
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Do let me know if you need any further info.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Resolving this issue is critical for us to move forward with our current application.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Rahul<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> nopoll-bounces@lists.aspl.es [mailto:nopoll-bounces@lists.aspl.es]
<b>On Behalf Of </b>Kale, Rahul<br>
<b>Sent:</b> Tuesday, April 19, 2016 6:33 PM<br>
<b>To:</b> nopoll@lists.aspl.es<br>
<b>Subject:</b> [noPoll] Websocket disconnect issues<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hello,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I have been trying to track down a random websocket disconnection issue that we are facing.<o:p></o:p></p>
<p class="MsoNormal">In our application, after some random number of hours of a good websocket connection that is
<o:p></o:p></p>
<p class="MsoNormal">transferring data back and forth, the connection suddenly breaks. We are using noPoll only<o:p></o:p></p>
<p class="MsoNormal">for the client side of a WebSocket connection. The server side is an Apache frontend with
<o:p></o:p></p>
<p class="MsoNormal">a Node app backend.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">After eliminating all other probable causes, I enabled noPoll debug logs and was finally able
<o:p></o:p></p>
<p class="MsoNormal">to catch the error which causes a disconnect: <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">nopoll_conn.c:2797 Failed to get next 2 bytes to read header from the wire, failed to received content, shutting down id=2 the connection
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Reproducing this is difficult since it takes anything from a couple of hours to a couple of days<o:p></o:p></p>
<p class="MsoNormal">but eventually it happens. My debugging efforts to narrow this down are thus taking quite a while.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We are using secure web sockets (wss://). <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Looking at the code, it seems that in the noPoll library, we expect the next two bytes of
<o:p></o:p></p>
<p class="MsoNormal">the websocket header to be always available if we read the first two bytes.
<o:p></o:p></p>
<p class="MsoNormal">I think that over TCP (and possibly even more so over TLS)<o:p></o:p></p>
<p class="MsoNormal">this may not always be guaranteed. The socket is non-blocking, and occasionally<o:p></o:p></p>
<p class="MsoNormal">the 4 header bytes may be split across two (or more) TCP segments.<o:p></o:p></p>
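<p class="MsoNormal">As an illustration only (this is not noPoll code), the sketch below uses a local socket pair to show how a non-blocking read can return fewer bytes than requested when only part of the header has arrived. The function name short_read_demo and the sample header bytes are hypothetical, and the sketch assumes a POSIX system with plain sockets; a TLS connection may differ in detail.<o:p></o:p></p>

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends only 3 of 4 header bytes, then tries three 2-byte non-blocking reads.
 * out[i] receives the return value of each read. Returns the errno left by
 * the last read, or -1 if the setup itself failed. */
int short_read_demo(ssize_t out[3])
{
    /* Sample header: binary frame, length marker 126, 16-bit length 256. */
    static const unsigned char hdr[4] = { 0x82, 0x7E, 0x01, 0x00 };
    unsigned char buf[2];
    int sv[2];
    int flags, last_errno;

    assert(out != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Make the reading end non-blocking and deliver a partial header. */
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0 ||
        write(sv[1], hdr, 3) != 3) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    out[0] = read(sv[0], buf, 2);  /* first two header bytes */
    out[1] = read(sv[0], buf, 2);  /* only one byte is left: short read */
    errno = 0;
    out[2] = read(sv[0], buf, 2);  /* nothing left: fails without blocking */
    last_errno = errno;            /* save before close() can change it */

    close(sv[0]);
    close(sv[1]);
    return last_errno;
}
```

<p class="MsoNormal">On a typical Linux system I would expect the second read to return 1 and the third to fail with EAGAIN (or EWOULDBLOCK), so a read that returns fewer than 2 bytes does not by itself mean the connection is broken.<o:p></o:p></p>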
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">When reading the first two bytes, the noPoll library seems to buffer if the data is not yet available.<o:p></o:p></p>
<p class="MsoNormal">For correctness, should this be done for subsequent parts of the header too?<o:p></o:p></p>
<p class="MsoNormal">Of course, I am not an expert here, so could you analyze this and let me know what could<o:p></o:p></p>
<p class="MsoNormal">be the root cause? What should be my next steps?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I have already verified that this is not due to an HTTP(S) server error. I took a TCP dump<o:p></o:p></p>
<p class="MsoNormal">and confirmed that when this happens, the first FIN is sent by the client (noPoll) to the HTTP server.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Regards,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Rahul<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Rahul Kale<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">IP Video Systems<o:p></o:p></p>
<p class="MsoNormal">Barco, Inc<o:p></o:p></p>
<p class="MsoNormal">1287 Anvilwood Ave<o:p></o:p></p>
<p class="MsoNormal">Sunnyvale, CA 94089<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Tel +1 408 400 4238<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Times New Roman","serif"">This message is subject to the following terms and conditions:
<a href="http://www.barco.com/en/maildisclaimer">MAIL DISCLAIMER</a> <o:p></o:p></span></p>
</div>
</div>
</body>
</html>