Discussion:
Persistent http connection
Srinivas Murthy
2018-11-07 19:33:35 UTC
Using http.request() like so,

local http = require("socket.http")
local ltn12 = require("ltn12")

local request_body = "name=value" -- example payload
local response_body = {}
local res, code, response_headers, status = http.request{
    url = myurl,
    method = "POST",
    headers = { ["Content-Length"] = #request_body },
    source = ltn12.source.string(request_body),
    sink = ltn12.sink.table(response_body),
}

opens a new TCP connection each time I invoke the call. Is there a way
to use a persistent TCP connection and avoid the connection setup/teardown
overhead?
Dirk Laurie
2018-11-07 19:56:08 UTC
You need to provide a file for storing cookies.

I can do it from the module 'curl' installed by 'luarocks install lua-curl'.

local curl = require "cURL" -- the Lua-cURL binding

local session = curl.easy()
local cookies
if curl.OPT_COOKIEFILE and curl.OPT_COOKIEJAR then
    cookies = os.tmpname() -- temporary file used as the cookie jar
    session:setopt(curl.OPT_COOKIEFILE, cookies) -- read cookies from it
    session:setopt(curl.OPT_COOKIEJAR, cookies) -- write cookies back to it
    logins[session] = cookies -- 'logins' is a table in my surrounding code
else
    -- 'message' is my own reporting helper
    message("Your 'curl' does not support cookies. You will be anonymous.")
end

The module 'http' should offer a similar service.
Coda Highland
2018-11-07 20:44:21 UTC
Post by Dirk Laurie
You need to provide a file for storing cookies. I can do it from the
module 'curl' installed by 'luarocks install lua-curl'. [...]
The module 'http' should offer a similar service.
That's curious. A cookie jar is required in order to use HTTP
keepalive? Is that documented? It seems like a dubious dependency.

/s/ Adam
Srinivas Murthy
2018-11-08 00:46:51 UTC
This is an important issue for me. Can anyone add to this, please? There
has got to be a way to avoid connection setup/teardown each time we
issue an http.request().

Thanks
Dirk Laurie
2018-11-08 07:49:20 UTC
On Thu., 8 Nov. 2018 at 02:47, Srinivas Murthy wrote:
This is an important issue for me. Can anyone add to this, please? There has got to be a way to avoid connection setup/teardown each time we issue an http.request().
Thanks
Post by Coda Highland
That's curious. A cookie jar is required in order to use HTTP
keepalive? Is that documented? It seems like a dubious dependency.
All I can attest to is that curl with a cookie jar works. It might
well be overkill if you need only keepalive, but since I use curl only
for sites that require me to log in, I would not know.
Philippe Verdy
2018-11-08 12:46:30 UTC
A cookie jar to store cookies is for something else: it does not create
"keepalive" HTTP sessions, but allows restarting sessions that have
already been closed by presenting the stored cookies again.

No, there's NO need at all for ANY cookie jar storage in HTTP to use
"keepalive" sessions.

So this just looks like a limitation of the current "http.*" package you
use, which terminates the session immediately after each request instead
of maintaining it. (Note: keeping a session alive is not guaranteed: the
server may close it at any time if the session is idle for too long, or
for various other reasons. But for HTTP sessions that you want to use,
for example, to perform streamed requests in a queue, reusing the session
will always be faster than creating a new outgoing session for each
request in the queue, with the cookie exchange also adding to the data
volume transmitted.)

Without "keepalive", if you want to perform a lot of small queries in a
long series of requests, or want to do streaming, your client would rapidly
exhaust its number of available TCP ports (because each new HTTP session
allocates a new port number, and even if it is closed and the server has
acknowledge that closure, that port number cannot be reused immediately
before a delay which is generally about 30 seconds (this varies if you have
proxies, or depending on your ISP or router, or depending on security
thresholds: for security and privacy of TCP, this delay should never be
reduced too much: there are security routers that force outgoing TCP port
numbers to remain in FIN_WAIT state for more than 3 minutes, and if your
Internet connection is shared by multiple users or other services, you'll
fall to a case where you no longer have any usable local TCP ports to allow
more connection, and your repeated HTTP queries will rapidly then fail to
connect after a few thousands queries performed rapidly).
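
As a back-of-the-envelope illustration (my numbers, assuming common
Linux defaults): the default ephemeral port range 32768-60999 gives
about 28000 usable ports, so with a 60-second TIME_WAIT a client can
sustain at most roughly 28000 / 60 ≈ 470 new connections per second to
one destination before running out of local ports.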

The "keepalive" feature was made a *standard* part of HTTP/1.1 (instead of
being optional and not working very well in the now very old HTTP/1.0
protocol) exactly to preserve network resources, and get much better
performance and lower bandwidth usage. Without it, most modern websites
would be extremely slow.

Using "curl" to create a cookie jar does not mean that you are using
keepalive, but at least it allows all requests you send using the same
cookie jar file to reopen new sessions using the same resident cookies that
were exchanged in the previous requests (so this "simulates" sessions, but
still you'll see many outgoing TCP sockets that are "half-terminated" in
FIN_WAIT state and that still block the unique port numbers that were
assigned to that socket).

So I suggest instead using a virtual "local HTTP proxy" package that
maintains an open, unterminated HTTP session over which you queue one or
more HTTP requests. Such a thing is used in (de)muxing subprotocols (of
multimedia streaming protocols, or in VPN protocol managers), but it is
also part of classic web browsers, which queue the many requests
performed in the background when visiting a website (a browser generally
creates up to four HTTP sessions working in parallel, each with its own
queue of requests). All of these use the keepalive feature of HTTP/1.1.
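
For illustration, here is a minimal sketch of that idea in plain
LuaSocket: two HTTP/1.1 requests sent over one TCP connection. It
assumes the server honors keep-alive and announces a Content-Length;
real code must also handle chunked transfer encoding and a server that
closes the connection early.

local socket = require("socket")

local host = "example.com" -- hypothetical host
local conn = assert(socket.connect(host, 80))

local function fetch(path)
    -- HTTP/1.1 keeps connections open by default; the header makes it explicit
    conn:send(string.format(
        "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: keep-alive\r\n\r\n",
        path, host))
    local status = assert(conn:receive("*l"))
    local length, line = 0, nil
    repeat -- read response headers up to the blank line
        line = assert(conn:receive("*l"))
        local n = line:match("^[Cc]ontent%-[Ll]ength:%s*(%d+)")
        if n then length = tonumber(n) end
    until line == ""
    return status, assert(conn:receive(length)) -- body; the socket stays open
end

print(fetch("/")) -- first request
print(fetch("/")) -- second request reuses the same TCP connection
conn:close()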
Dirk Laurie
2018-11-08 13:05:24 UTC
Post by Philippe Verdy
A cookie jar to store cookies is for something else: it does not create
"keepalive" HTTP sessions, but allows restarting sessions that have
already been closed by presenting the stored cookies again. [...]
Thanks a lot for this explanation.

-- Dirk
Philippe Verdy
2018-11-09 00:10:13 UTC
What would be needed is for the "http" package to include in the created
instance an optional boolean "keepalive" property which (once set to
true) would cause it NOT to close the session automatically once the
result has arrived, but to keep the session open, plus a "close()"
method that your application can call when it no longer needs the
session. That package should also include an "isalive()" method to check
whether the session is still open (in "idle" state, or in "running"
state waiting for replies).

That package could also handle a "request queue", to allow sending
multiple queries in order. Each query could be associated with some user
data, so that when you get the results you can identify which query they
belong to. Ideally, the user data would be the "query" object itself,
which has its own state (possibly the resource name or query string, the
verb like "GET"/"POST", and other user data you can use, for example, to
hold a persistent cookie jar or other persistent data needed by your
application). When you receive a reply (success or failure) you also get
the reference to that user data, along with the state of the HTTP
session.
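
A sketch of what such an interface might look like (every name here is
hypothetical; no existing package exposes this API):

local khttp = require("khttp") -- imaginary keepalive-aware HTTP module

local session = khttp.open{
    host = "example.com", port = 80,
    keepalive = true, -- do not close the session after each response
}

session:queue{
    method = "GET", path = "/a",
    userdata = { id = 1 }, -- handed back with the reply for correlation
    on_reply = function(reply, userdata)
        print(userdata.id, reply.status)
    end,
}

if session:isalive() then
    session:queue{ method = "GET", path = "/b" }
end

session:close() -- explicit teardown when the application is done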

Note: an HTTP session is not necessarily bound to a TCP session (it
could be any bidirectional I/O channel): that binding is only made when
you connect it, by opening a socket to the target indicated by the host
and port number, i.e. the first part of the URL (everything before the
first "/", "?" or "#" of the full URL, and excluding everything after
the first "#", which is an anchor). The HTTP protocol itself does not
"understand" the domain name, host address, port number, or anchor; it
only needs the query.
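
For example, a rough Lua pattern for that split (a sketch only; a real
URL parser handles many more cases):

-- Split the connection target from the request-target; the "#" fragment
-- never goes on the wire.
local url = "http://example.com:8080/path?q=1#frag"
local scheme, host, port, target =
    url:match("^(%w+)://([^:/?#]+):?(%d*)([^#]*)")
if port == "" then port = "80" end -- default port for plain HTTP
print(scheme, host, port, target) --> http example.com 8080 /path?q=1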

Note as well that an HTTP query does not send only a verb (like "GET",
"POST", "PUT", "HEAD"...) and a resource name (usually starting with
"/"): it also sends a set of MIME headers, and it may "stream" an output
attachment which can be arbitrarily long. If you want the query to be
handled asynchronously, the async event handler must be able to check
the status of the session: idle, sending headers, sending an attachment,
waiting for a reply from the remote server, notification that a reply
has started to arrive (i.e. you have received at least an HTTP response
status), and then whether the response is complete, including the full
content body, rather than just some of the MIME headers.

Normally it is up to the "http" package to handle some responses itself,
internally, such as intermediate statuses while the server confirms it
has received a query and has started processing it but cannot yet give a
definitive result, or a redirect reply (it is up to the client to accept
a redirect like "moved to" and, if it does, re-execute the query against
the new target).

The package should also include internal support for the "chunked"
format used for streamed partial responses, encoding and decoding it for
you. It should likewise handle the negotiation of options like data
compression and encryption/decryption, and it should either manage the
cookie jar itself or provide an API so that your client is notified when
the server sends a cookie, which the client may store or discard as it
wishes.

If the "http" package works only in synchronous mode, then all queries are
blocking, but then you cannot handle a queue of requests (so you don't need
at all any private data: the query will be terminated, but this is very
limitating because it does not allow sending large queries or receive large
responses (either in MIME headers, or in the content body). Running
asynchronously allows much better management, but doe not necessarily
implies multithreading, and the Lua coroutines (with yield and resume) can
be used to handle the state of a session in a "semi-blocking" way, just
like I/O on files. Effectively the HTTP protocol is just using the generic
"producer/consumer" concept over a single bidirectional I/O channel (it's
not up to HTTP itself to open that channel and negociate the options, not
even the HTTPS securisation, and it is agnostic about the transport
protocol used, which may be TCP, or a TCP-like VPN over UDP, or a serial
link, HTTP as well will not resolve itself DNS hostnames, needed before you
can open an outgoing socket; and the protocol itself does not need that you
initiated yourself the session before sending a query: that bidirectional
I/O channel may have be initiated by the remote agent: HTTP queries are
asymetric with a server and a client, but the asymetry is not necessarily
the same for the bidirectional I/O channel on which it is established, so
the same channel, once it's established and both agents are waiting for a
query to execcute, may be used by one or the other agent to start a query
in which case it will be a HTTP client and the other agent will be a HTTP
server replying to the query, but the roles can be swapped later and it's
up to each agent to decide when he wants to terminate/close the I/O channel
otself, or to indicate to the other agent that he should terminate/close
the session once it has processed the query or the response, and it is the
role of the "keepalive" option in MIME headers; as well HTTP allows any of
the two agent to terminate the session: this is indicated by an I/O error
or close event deteted by the agent that was waiting the other party. HTTP
has well does not provide itself the facility to close the I/O channel
itself: the HTTP protocol is by default illimited in time).
Srinivas Murthy
2018-11-08 17:36:20 UTC
"So I suggest instead using a virtual "local HTTP proxy" package that will
maintain a session opened on an unterminated HTTP session, overwhich you'll
queue one or multiple http requests. "
Is anyone aware of a lua http lib that supports keepalive? Using a "local
HTTP proxy" will still be a significant overhead if the client still has to
setup a new conn for each req. These are streaming events and could be very
frequent.
Tim McCracken
2018-11-08 18:04:58 UTC
Post by Dirk Laurie
All I can attest to is that curl with a cookie jar works. It might
well be overkill if you need only keepalive, but since I use curl only
for sites that require me to log in, I would not know.
It sounds like you are confusing session state (keeping a session alive using cookies) with HTTP keep-alive; they are two very different things. HTTP keep-alive has nothing to do with being logged in or not. It only affects requests that occur over a very short time span, as a previous poster noted. For example, if you open a web page with 100 photos on it, each of those 100 photos is a different HTTP request; keep-alive enables all of them to be downloaded over a single TCP session. But a few seconds after everything is downloaded, that TCP session goes away. The presence or absence of cookies has nothing to do with this. In fact, session state works just fine without HTTP keep-alive, and can span days, months, or years depending on how the cookies are set to expire.

For the original poster: the cURL C library supports keep-alive (I think they simply refer to it as TCP keep-alive). I would be very surprised if the thin Lua wrapper libraries didn't support it as well. You may just need to use a cURL-based library if it otherwise meets your needs.
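
For instance, a minimal sketch with the Lua-cURL binding mentioned
earlier (assuming 'luarocks install lua-curl'): libcurl keeps the
connection alive on its own when the same easy handle performs several
requests against one host.

local curl = require "cURL" -- the Lua-cURL binding

local easy = curl.easy()
for _, path in ipairs{ "/one", "/two", "/three" } do
    easy:setopt(curl.OPT_URL, "http://example.com" .. path)
    easy:perform() -- same handle, so the TCP connection is reused
end
easy:close()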
Daurnimator
2018-11-09 02:05:18 UTC
On Fri, 9 Nov 2018 at 04:36, Srinivas Murthy wrote:
Is anyone aware of a Lua HTTP lib that supports keepalive? Using a "local HTTP proxy" would still carry significant overhead if the client still has to set up a new connection for each request. These are streaming events and could be very frequent.
lua-http is gaining support for it soon:
https://github.com/daurnimator/lua-http/pull/121
It's a much trickier problem than it may seem on the surface,
especially once SSL is involved (in fact I have found bugs in
nginx's and curl's implementations while doing research for
lua-http's).
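
For reference, lua-http's documented one-shot interface looks like this;
the connection reuse in question is what the PR above adds:

local request = require "http.request"

local req = request.new_from_uri("http://example.com/")
local headers, stream = assert(req:go()) -- performs the request
print(headers:get(":status")) -- HTTP/2-style status pseudo-header
print(assert(stream:get_body_as_string()))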
Srinivas Murthy
2018-11-09 18:01:49 UTC
I appreciate all the discussion. I have experience with this, and yes,
it's not simple to do properly.
For now, though, I'm on a tight time frame and need a simple solution
that works in a non-HTTPS setup. The closest I see is the curl wrapper
mentioned above. Any other ideas?
Philippe Verdy
2018-11-09 20:57:30 UTC
You may want to look at LuaSocket and implement your own wrapper for
persistent TCP sessions, on top of which you reimplement the HTTP
protocol:
https://luarocks.org/modules/luarocks/luasocket
https://github.com/diegonehab/luasocket
The bad thing is that it is not easily usable on basic Lua servers
without the native C support library. The biggest difficulty is that, to
fully implement persistent sessions and streaming, the simple
consumer/receiver pattern Lua uses for I/O (which is synchronous and
based on coroutines whose execution is controlled by blocking
yield/resume calls) will not easily let you react to the asynchronous
send/receive events that a streaming protocol usually requires (you
would also need true multithreading, not just cooperative threading).

Given these limits, the "http" package offers no real "resume" facility,
and since the socket it creates is temporary and can be garbage
collected as soon as a request terminates (after which the socket sits
in TIME_WAIT for a long time, forbidding reuse of the dynamically
assigned TCP port number), it is not easy to avoid its closure. So each
query opens its own separate socket, and you will "burn" a lot of
outgoing local TCP ports if you intend to use it for streaming many
small HTTP requests.

A solution/emulation is possible, however (just as with old versions of
WinSock in the old cooperative-only 16-bit versions of Windows, also
based on yield/resume with a message loop, which adapts easily to
pure-Lua coroutines, already used by Lua's basic I/O library), provided
that your Lua application is cooperative and yields often enough to
serve both the "send" and "receive" events and to manage the two message
queues on the socket: one queue for outgoing HTTP requests, the other
for the incoming responses. A smart implementation of HTTP would use not
just a single pair of queues (one TCP session) but could create up to
four pairs of queues per destination (i.e. host and port number, where a
host is either a domain name or an IPv4 or IPv6 address). With that you
would emulate what web browsers already do to load pages and their many
dependent scripts and images, without abusing remote server resources.
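
As a toy illustration of that cooperative queue pattern (the structure
is mine, for illustration; it is not an existing package):

-- One request queue and one response queue pumped cooperatively over a
-- single persistent connection.
local requests, responses = {}, {}

local pump = coroutine.create(function(conn)
    while true do
        local req = table.remove(requests, 1)
        if req then
            conn:send(req) -- next queued request goes out
            -- (a real pump would parse a complete HTTP response here)
            responses[#responses + 1] = conn:receive("*l")
        end
        coroutine.yield() -- hand control back to the application
    end
end)

-- Application side: queue work, then pump the connection when idle.
requests[#requests + 1] = "GET /a HTTP/1.1\r\nHost: example.com\r\n\r\n"
requests[#requests + 1] = "GET /b HTTP/1.1\r\nHost: example.com\r\n\r\n"
-- coroutine.resume(pump, conn) -- 'conn' would be a connected LuaSocket client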

Note that the absence of persistent TCP sessions does not affect only
the local host (whose local outgoing TCP port numbers are allocated to
each new socket the client creates), but also the remote server (whose
local incoming TCP port numbers are allocated to each accepted incoming
connection): port exhaustion on the server may be even more critical.
Servers also need to keep the TIME_WAIT delays if they want to secure
their communications, both to avoid sending private data to new incoming
clients and to avoid data arriving too late from a previously connected
client polluting the incoming data of new connections!

All HTTP clients and servers today need to support "keepalive" as
described in HTTP/1.1. The old behavior without it (in HTTP/1.0) is no
longer acceptable and causes severe security problems (notably, it
exposes servers to DoS attacks if they let the same remote client open
an arbitrary number of new temporary incoming connections).
Philippe Verdy
2018-11-09 21:16:52 UTC
Note that, to avoid DoS attacks on servers or the exhaustion of outgoing
port numbers in clients, an OS may perform an optimization: if a closed
TCP socket still in TIME_WAIT state is reconnected to the same remote
host/port pair, the OS can reallocate the same local port number
directly, instead of allocating a separate new one, and put the socket
back in CONNECTED state (it may also skip the TCP MSS negotiation at the
start of the session and reuse an existing large TCP window size to
allow a "fast start").
Sam Chang
2018-11-12 02:19:28 UTC
I have a small non-blocking project based on Lua coroutines which
supports this case:

http://luafan.com/api/http.html

by using the "fan.http.core" module, which is implemented on top of cURL
(connections are reused by default).

This is still a small project, and I'm trying to replace the curl
implementation with "fan.tcpd", but "fan.http.core" may solve your
problem.

Regards,
Sam