Discussion:
[tex-live] Still troubles when trying to reach some pages of tug/texlive site
Denis Bitouzé
2018-02-12 06:53:53 UTC
Hi,

I still have trouble reaching some pages of the TeX Live
site. I can always reach:
┌────
│ http://www.tug.org/svn/texlive/trunk/Master/
└────
but, most of the time, not, for example:

┌────
│ $ wget https://www.tug.org/texlive/lists.html
│ --2018-02-12 07:35:05-- https://www.tug.org/texlive/lists.html
│ Resolving www.tug.org… 91.121.174.77
│ Connecting to www.tug.org|91.121.174.77|:443… failed: Connection refused
└────
or
┌────
│ $ wget http://www.tug.org/svn/texlive\?view\=revision\&revision\=46603
│ --2018-02-12 07:42:45-- http://www.tug.org/svn/texlive?view=revision&revision=46603
│ Resolving www.tug.org… 91.121.174.77
│ Connecting to www.tug.org|91.121.174.77|:80… failed: Connection refused
└────

Sometimes, I can temporarily reach pages such as
┌────
│ http://www.tug.org/svn/texlive?view=revision&revision=46603
└────
but not:
┌────
│ http://www.tug.org/svn/texlive/trunk/Master/texmf-dist/tex/generic/tex4ht/tex4ht.sty?r1=46603&r2=46602&pathrev=46603
└────

Very frustrating indeed...
--
Denis
Norbert Preining
2018-02-12 07:32:28 UTC
Post by Denis Bitouzé
Sometimes, I can temporarily reach pages such as
There is a block installed on tug that should catch robots. If you are
hitting too hard and quickly on the server/svn space, you will be blocked
for some time.

Norbert

--
PREINING Norbert http://www.preining.info
Accelia Inc. + JAIST + TeX Live + Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
Denis Bitouzé
2018-02-12 07:42:20 UTC
Post by Norbert Preining
Post by Denis Bitouzé
Sometimes, I can temporarily reach pages such as
There is a block installed on tug that should catch robots. If you are
hitting too hard and quickly on the server/svn space, you will be
blocked for some time.
Well, my excursion:

1. http://www.tug.org/svn/texlive/trunk/Master/ (okay)
2. http://www.tug.org/svn/texlive/trunk/Master/?pathrev=46602 (okay)
3. http://www.tug.org/svn/texlive?view=revision&revision=46602 (blocked)

is made at a reasonable, normal, speed.

Isn't the block installed too sensitive? Maybe the allowed delay between
two clicks could be relaxed a bit...
--
Denis
Norbert Preining
2018-02-12 07:53:55 UTC
Post by Denis Bitouzé
is made at a reasonable, normal, speed.
Then it should work. I got hit by clicking through the svn several
times, but then we lowered the bar and it should be fine now, especially
when you only hit three locations.

Karl should check the web server log ...

Norbert

Denis Bitouzé
2018-02-12 07:56:11 UTC
Post by Norbert Preining
Post by Denis Bitouzé
is made at a reasonable, normal, speed.
Then it should work. I got hit by clicking through the svn several
times, but then we lowered the bar and it should be fine now, especially
when you only hit three locations.
Sigh... This time I cannot even reach:

┌────
│ http://www.tug.org/svn/texlive/trunk/Master/
└────
Post by Norbert Preining
Karl should check the web server log ...
Yes, please.
--
Denis
Mojca Miklavec
2018-02-12 09:01:24 UTC
Post by Norbert Preining
Post by Denis Bitouzé
is made at a reasonable, normal, speed.
Then it should work. I got hit by clicking through the svn several
times, but then we lowered the bar and it should be fine now, especially
when you only hit three locations.
Karl should check the web server log ...
I cannot recall the time, but I was also blacklisted recently after
merely browsing the subversion repository at what I consider a pretty
normal speed (and I'm more on the slow end of the spectrum compared to
other human beings :).

(Given the super lengthy paths in TeX Live, it's relatively easy to
click 10 times just to get to the desired folder, so I won't claim
that I only clicked three times.)

Pretty annoying in addition to the fact that some folders have
additional problems with too many subfolders and are somewhat
problematic to browse, and that some files are blacklisted for viewing
anyway ... but I'll probably start using your (Norbert's) git
repository once I get annoyed enough :)

Mojca
Norbert Preining
2018-02-12 09:29:21 UTC
Post by Mojca Miklavec
problematic to browse, and that some files are blacklisted for viewing
https://git.texlive.info/texlive/tree

very fast ;-)

But I will probably move to gitea at some point (lightweight gitlab).

Norbert

Enrico Gregorio
2018-02-12 10:26:57 UTC
Post by Denis Bitouzé
Post by Norbert Preining
Post by Denis Bitouzé
Sometimes, I can temporarily reach pages such as
There is a block installed on tug that should catch robots. If you are
hitting too hard and quickly on the server/svn space, you will be
blocked for some time.
1. http://www.tug.org/svn/texlive/trunk/Master/ (okay)
2. http://www.tug.org/svn/texlive/trunk/Master/?pathrev=46602 (okay)
3. http://www.tug.org/svn/texlive?view=revision&revision=46602 (blocked)
is made at a reasonable, normal, speed.
Isn't the block installed too sensitive? Maybe the allowed delay between
two clicks could be relaxed a bit...
--
Denis
Happens regularly also to me. Waiting between one click and the next is
irrelevant: at the third link, access is blocked.

Ciao
Enrico
Denis Bitouzé
2018-02-12 10:42:16 UTC
Post by Enrico Gregorio
Happens regularly also to me. Waiting between one click and the next is
irrelevant: at the third link, access is blocked.
Nice to not be the only one with this problem :)

Ciao.
--
Denis
Siep Kroonenberg
2018-02-12 12:18:43 UTC
Post by Denis Bitouzé
Post by Enrico Gregorio
Happens regularly also to me. Waiting between one click and the next is
irrelevant: at the third link, access is blocked.
Nice to not be the only one with this problem :)
And thank you for bringing it up.
--
Siep Kroonenberg
Denis Bitouzé
2018-02-12 12:52:57 UTC
Post by Siep Kroonenberg
Post by Denis Bitouzé
Post by Enrico Gregorio
Happens regularly also to me. Waiting between one click and the next is
irrelevant: at the third link, access is blocked.
Nice to not be the only one with this problem :)
And thank you for bringing it up.
You're welcome :)
--
Denis
Enrico Gregorio
2018-02-12 14:35:28 UTC
An experiment.

1. Went to the “home page” http://tug.org/svn/texlive/trunk/Master/texmf-dist/?sortby=date

2. Waited 30 seconds

3. Clicked on the number at the top left (last revision), today 46605

4. Waited 30 seconds

5. Selected a revision number pasting it into the box; used 46586; clicked on “go”

6. Waited 30 seconds (and more)

7. Clicked on “text changed” relative to etoolbox.sty

Access blocked. I don’t think 30 seconds is short enough to trigger the block.
And examining what has changed on a specific day (I’m curious, you know) requires
looking at several pages.
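
For illustration only: the experiment above is what one would expect if the block counts hits over a long window rather than looking at the pause between clicks. The sketch below is a minimal simulation under that assumption. The function name `first_blocked_request` and both threshold settings are hypothetical and are not taken from the tug.org configuration.

```python
from collections import deque


def first_blocked_request(timestamps, max_hits, window_seconds):
    """Return the 1-based index of the first blocked request, or None.

    A request is blocked when it is the `max_hits`-th request seen
    within the last `window_seconds` seconds (counting itself).
    """
    recent = deque()
    for index, now in enumerate(timestamps, start=1):
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= window_seconds:
            recent.popleft()
        recent.append(now)
        if len(recent) >= max_hits:
            return index
    return None


# Four page loads spaced 30 seconds apart (assumes one hit per page).
clicks = [0, 30, 60, 90]

# Hypothetical strict setting: 3 hits within 10 minutes trigger a block.
print(first_blocked_request(clicks, max_hits=3, window_seconds=600))  # 3

# Hypothetical relaxed setting: 20 hits within 10 seconds.
print(first_blocked_request(clicks, max_hits=20, window_seconds=10))  # None
```

With a low hit count and a long window, waiting 30 seconds (or even a few minutes) between clicks changes nothing; only raising the count or shortening the window does.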

Ciao
Enrico
Hironobu Yamashita
2018-02-12 14:56:08 UTC
Hi Enrico,

I have exactly the same situation in Japan.

I remember correctly that I had no problem on 2017-12-18,
[tex-live] www.tug.org unreachable
And I also remember correctly that I’ve been having
trouble at least since 2018-01-10 (according to my Twitter).

Regards,
Hironobu
Reinhard Kotucha
2018-02-13 00:31:55 UTC
Sorry, I sent the previous mail too early.
Post by Norbert Preining
Post by Denis Bitouzé
Sometimes, I can temporarily reach pages such as
There is a block installed on tug that should catch robots. If you
are hitting too hard and quickly on the server/svn space, you will
be blocked for some time.
Really?

while true; do wget https://www.tug.org/texlive/lists.html; done

works fine here and

wget -r https://www.tug.org/texlive

too. I don't understand why Denis can't download lists.html at all.

BTW, blocking robots reliably without bothering normal users is a very
difficult task. It's probably better not to rely on the time between
two requests but on the number of requests within a certain amount of
time.
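
A minimal sketch of the difference between the two approaches, for illustration only. The class names `MinGapPolicy` and `WindowCountPolicy` and all thresholds are made up for this example; this is not how the tug.org server is configured.

```python
from collections import deque


class MinGapPolicy:
    """Flag a client when two requests arrive less than `min_gap` seconds apart."""

    def __init__(self, min_gap):
        self.min_gap = min_gap
        self.last_seen = None

    def allow(self, now):
        too_fast = self.last_seen is not None and now - self.last_seen < self.min_gap
        self.last_seen = now
        return not too_fast


class WindowCountPolicy:
    """Flag a client that makes more than `max_requests` requests in `window` seconds."""

    def __init__(self, max_requests, window):
        self.max_requests = max_requests
        self.window = window
        self.recent = deque()

    def allow(self, now):
        # Drop timestamps that are older than the window.
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        self.recent.append(now)
        return len(self.recent) <= self.max_requests


def decisions(policy, timestamps):
    """Feed the timestamps to the policy and collect its allow/deny answers."""
    return [policy.allow(t) for t in timestamps]


# A person who clicks twice in quick succession, then reads for a while.
human = [0.0, 0.4, 45.0, 90.0]
# A crawler that steadily fetches one page per second.
crawler = [float(t) for t in range(100)]

print(decisions(MinGapPolicy(0.5), human))          # [True, False, True, True]
print(all(decisions(MinGapPolicy(0.5), crawler)))   # True: the crawler slips through
print(all(decisions(WindowCountPolicy(30, 60), human)))    # True
print(all(decisions(WindowCountPolicy(30, 60), crawler)))  # False: the crawler is caught
```

The gap rule punishes one impatient double click and misses a steady crawler; the count rule does the opposite, provided the count and window are chosen generously enough for normal browsing.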

Wget fails when I replace "www.tug.org" with its IP number. This is
good because most robots simply scan IP numbers.

Regards,
Reinhard
--
------------------------------------------------------------------
Reinhard Kotucha Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover mailto:***@web.de
------------------------------------------------------------------
M.Eng. René Schwarz
2018-02-13 21:25:39 UTC
Hi Norbert,
Post by Norbert Preining
There is a block installed on tug that should catch robots. If you are
hitting too hard and quickly on the server/svn space, you will be blocked
for some time.
I just wanna add to the discussion that I am experiencing these blocks
not only for the SVN web viewer, but for the normal TUG website, too.
After three subsequent page requests my IP is being blocked. This is
very annoying.

By switching proxy servers, I am able to reach the TUG web server.


Kind regards,
René
--
Sincerely yours,


M.Eng. *René Schwarz*
https://www.rene-schwarz.com
Philip Taylor (RHUoL)
2018-02-12 09:49:59 UTC
Post by Denis Bitouzé
Hi,
I still have troubles when trying to reach some pages of texlive
┌────
│ $ wget https://www.tug.org/texlive/lists.html
P:\>wget https://www.tug.org/texlive/lists.html
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
--2018-02-12 09:48:49--  https://www.tug.org/texlive/lists.html
Resolving www.tug.org... 91.121.174.77
Connecting to www.tug.org|91.121.174.77|:443... connected.
ERROR: cannot verify www.tug.org's certificate, issued by
`/C=US/O=DigiCert Inc/
  Unable to locally verify the issuer's authority.
To connect to www.tug.org insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.
Philip Taylor
Zdenek Wagner
2018-02-12 10:05:01 UTC
It seems that wget for Windows does not search the system certificate store
for SSL certificates. DigiCert is certainly trusted by Windows browsers.


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz
Karl Berry
2018-02-13 00:17:17 UTC
It seems the fail2ban rate-limiting options I put in weeks ago, after
Denis's first report, didn't actually override the defaults. Sigh. I've
simply disabled the rate limiting inside /svn/ for now, since I have no
energy to debug it.

Of course I would not have bothered to spend^Wwaste my time setting it
up in the first place if the server had not been adversely affected by
crawlers in defiance of robots.txt. Whenever I look at the logs, useless
bots are almost all I see, and the big source repository takes a lot
more resources to serve to them than anything else. Oh well, will look
at it again when I have to. -k
Michael Shell
2018-02-13 02:46:01 UTC
On Tue, 13 Feb 2018 00:17:17 GMT
Post by Karl Berry
It seems the fail2ban rate-limiting options I put in weeks ago, after
Denis's first report, didn't actually override the defaults. Sigh. I've
simply disabled the rate limiting inside /svn/ for now, since I have no
energy to debug it.
According to the jail.conf man page,

https://manpages.debian.org/unstable/fail2ban/jail.conf.5.en.html

there are four different types of fail2ban configuration files. So,
it sure is easy for something to go wrong there.
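
As a rough illustration of one way an override can silently fail to apply: the fail2ban documentation describes `.local` files as overriding `.conf` files. The sketch below mimics that layering with Python's `configparser`; it is not fail2ban's own loader. The jail name `web-ratelimit` and all values are hypothetical stand-ins, and nothing here is a claim about what actually happened on tug.org.

```python
import configparser

# Stand-in for packaged defaults (what would live in jail.conf).
JAIL_CONF = """
[DEFAULT]
maxretry = 3
findtime = 600
bantime = 600

[web-ratelimit]
enabled = false
"""

# Stand-in for local overrides (what would live in jail.local).
JAIL_LOCAL = """
[web-ratelimit]
enabled = true
maxretry = 60
findtime = 60
"""

# Same overrides, but with a typo in the section name.
JAIL_LOCAL_TYPO = """
[web-ratelimt]
enabled = true
maxretry = 60
findtime = 60
"""


def effective_settings(jail, *sources):
    """Layer the sources in order; later sources override earlier ones."""
    parser = configparser.ConfigParser()
    for text in sources:
        parser.read_string(text)
    # Section lookups fall back to [DEFAULT] for options the jail does not set.
    return dict(parser[jail])


good = effective_settings("web-ratelimit", JAIL_CONF, JAIL_LOCAL)
bad = effective_settings("web-ratelimit", JAIL_CONF, JAIL_LOCAL_TYPO)

print(good["maxretry"], good["findtime"], good["bantime"])  # 60 60 600
print(bad["maxretry"], bad["findtime"], bad["bantime"])     # 3 600 600
```

In the second case no error is raised: the misnamed section is simply treated as a different jail, and the intended one keeps the defaults.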

Also, the fail2ban service must be restarted (service fail2ban restart)
or "SIGHUPed" after a config file change.

It might save some trouble to just ask the fail2ban-users mailing list

https://sourceforge.net/p/fail2ban/mailman/fail2ban-users/

why a so-and-so config change is not having any effect.

Also, FWIW, there is a good fail2ban config article here:
https://www.digitalocean.com/community/tutorials/how-to-protect-ssh-with-fail2ban-on-ubuntu-14-04


Just my $0.02,

Mike Shell
Denis Bitouzé
2018-02-13 06:25:34 UTC
Post by Karl Berry
It seems the fail2ban rate-limiting options I put in weeks ago, after
Denis's first report, didn't actually override the
defaults. Sigh. I've simply disabled the rate limiting inside /svn/
for now, since I have no energy to debug it.
I can understand. Anyway, seems to work nicely this morning: thanks!
Post by Karl Berry
Of course I would not have bothered to spend^Wwaste my time setting it
up in the first place if the server had not been adversely affected by
crawlers in defiance of robots.txt. Whenever I look at the logs,
useless bots are almost all I see, and the big source repository takes
a lot more resources to serve to them than anything else. Oh well,
will look at it again when I have to. -k
I sympathize very much with you!
--
Denis
Karl Berry
2018-02-13 23:22:44 UTC
for the normal TUG website, too.

Which pages?

I recently set up rate-limiting for /mailman/ and /pipermail/ pages, and
that seemed ok with normal use in my testing. There is not and never has
been rate limiting anywhere else.

Also, the web server was up and down yesterday while I was trying out
all this insanity. -k
Karl Berry
2018-02-13 23:26:15 UTC
for the normal TUG website, too.

Oh, I think after messing around on the mailman pages, those hits then
somehow apply to a future hit on the rest of the site. More pain. I
disabled the web rate-limiting entirely ... -k
M.Eng. René Schwarz
2018-02-18 10:16:10 UTC
Hi Karl,
Post by Karl Berry
Oh, I think after messing around on the mailman pages, those hits then
somehow apply to a future hit on the rest of the site. More pain. I
disabled the web rate-limiting entirely ... -k
thank you for the reconfiguration of the server. Now I can confirm that
I experience no troubles reaching the tug.org website (all parts of it)
anymore.


Kind regards,
René
--
Sincerely yours,


M.Eng. *René Schwarz*
***@rene-schwarz.com
https://www.rene-schwarz.com
Denis Bitouzé
2018-02-18 11:36:35 UTC
Hi,
Post by M.Eng. René Schwarz
Post by Karl Berry
Oh, I think after messing around on the mailman pages, those hits
then somehow apply to a future hit on the rest of the site. More
pain. I disabled the web rate-limiting entirely ... -k
thank you for the reconfiguration of the server. Now I can confirm
that I experience no troubles reaching the tug.org website (all parts
of it) anymore.
The same applies to me: thanks!
--
Denis