Avoiding DNS cache sharing
Guillaume Quintard
2018-07-16 21:50:26 UTC
Hello everyone,

I have an issue with the DNS cache of one easy handle polluting other
transfers and I was wondering if you could help me.

Here's some background:
- multithreaded application that broadcasts HTTPS requests to multiple servers.
- so for each IP, I need to point libcurl to that IP while still using the
same CURLOPT_URL for everyone.
- I started with CURLOPT_RESOLVE but that doesn't work because even
with CURLOPT_DNS_CACHE_TIMEOUT set to 0, the data goes into the cache
and messes things up.
- I should use CURLOPT_CONNECT_TO, I think, but I'm targeting centos7
and it's not available on that platform.
- I'd really like to avoid switching to multi handles if possible.

My current attempt focuses on using a shared object per node, but that
is failing. The code is there: https://pastebin.com/Tz0JcFPS (this is
a quick PoC cobbled together to test the shared object approach).

My question is: is the base idea of using the shared object plain
wrong, or should it work and I "only" botched the implementation? And
is there a better approach I missed?

Please let me know if I forgot anything important.

Best regards,
--
Guillaume Quintard
-----------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
Etiquette: https://curl.haxx.se/ma
Daniel Stenberg
2018-07-16 22:27:19 UTC
Post by Guillaume Quintard
I have an issue with the DNS cache of one easy handle polluting other
transfers and I was wondering if you could help me.
You clearly wanted this email sent to the curl-library mailing list.
curl-users is primarily for issues and discussions regarding the command line
tool.

I'm CC'ing my reply to the curl-library list.
Post by Guillaume Quintard
- so for each IP, I need to point libcurl to that IP while still using the
same CURLOPT_URL for everyone.
- I started with CURLOPT_RESOLVE but that doesn't work because even
with CURLOPT_DNS_CACHE_TIMEOUT set to 0, the data goes into the cache
and messes things up.
CURLOPT_RESOLVE always puts entries into the DNS cache. That is its only
purpose!
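
For reference, each CURLOPT_RESOLVE entry is a string of the form
HOST:PORT:ADDRESS, passed to libcurl in a curl_slist. Below is a minimal
sketch of building such an entry for one node. The helper name
format_resolve_entry is hypothetical (it is not a libcurl function), and the
libcurl calls themselves are left out so the snippet builds without libcurl.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of libcurl: writes "HOST:PORT:ADDRESS"
 * into buf. Returns 0 on success, -1 if the output did not fit. */
int format_resolve_entry(char *buf, size_t len,
                         const char *host, int port, const char *address)
{
    int n = snprintf(buf, len, "%s:%d:%s", host, port, address);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string would be appended to a list with curl_slist_append()
and the list set with curl_easy_setopt(handle, CURLOPT_RESOLVE, list). The
entry then lives in whichever DNS cache that handle uses, so any other handle
using the same cache sees the same pinned address.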
Post by Guillaume Quintard
- I should use CURLOPT_CONNECT_TO, I think, but I'm targeting centos7
and it's not available on that platform.
Sure it is. You just need to make sure you use a modern libcurl.
CURLOPT_CONNECT_TO was added in libcurl 7.49.0, released more than two years
ago.

(Using stock libs on centos of course dooms you to using outdated libraries, I
know.)
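
As an illustration, with libcurl 7.49.0 or later each CURLOPT_CONNECT_TO
entry is a string of the form HOST:PORT:CONNECT-TO-HOST:CONNECT-TO-PORT. The
sketch below builds one such entry per node. The helper name
format_connect_to is hypothetical, and the libcurl calls are omitted so the
snippet compiles on its own.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of libcurl: writes
 * "HOST:PORT:CONNECT-TO-HOST:CONNECT-TO-PORT" into buf.
 * url_host/url_port are the ones in CURLOPT_URL; node_host/node_port
 * identify the node to actually connect to.
 * Returns 0 on success, -1 if the output did not fit. */
int format_connect_to(char *buf, size_t len,
                      const char *url_host, int url_port,
                      const char *node_host, int node_port)
{
    int n = snprintf(buf, len, "%s:%d:%s:%d",
                     url_host, url_port, node_host, node_port);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Each thread would build its own entry, append it to its own list with
curl_slist_append(), and set it with
curl_easy_setopt(handle, CURLOPT_CONNECT_TO, list), leaving CURLOPT_URL
identical for every node.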
Post by Guillaume Quintard
- I'd really like to avoid switching to multi handles if possible.
My current attempt focuses on using a shared object per node, but that
is failing. The code is there: https://pastebin.com/Tz0JcFPS (this is
a quick PoC cobbled together to test the shared object approach).
My question is: is the base idea of using the shared object plain
wrong, or should it work and I "only" botched the implementation? And
is there a better approach I missed?
I don't understand what you want and what doesn't work! You said "the data
goes into the cache and messes things up" and yet you said you explicitly want
to share the DNS cache between the handles?

When the CURLOPT_RESOLVE data goes into the cache, then if that cache is set
to be held by a share object (that shares DNS), that DNS cache is shared
between all easy handles that use that same share object.
--
/ daniel.haxx.se