mirror – Grey Panthers Savannah https://grey-panther.net Just another WordPress site Sun, 08 May 2022 11:39:11 +0000 en-US hourly 1 https://wordpress.org/?v=6.5.2 206299117 Proxying pypi / npm / etc for fun and profit! https://grey-panther.net/2014/02/proxying-pypi-npm-etc-for-fun-and-profit.html https://grey-panther.net/2014/02/proxying-pypi-npm-etc-for-fun-and-profit.html#respond Wed, 05 Feb 2014 15:26:00 +0000 Package managers for source code (like pypi, npm, nuget, maven, gems, etc) are great! We should all use them. But what happens if the central repository goes down? Suddenly all your continious builds / deploys fail for no reason. Here is a way to prevent that:

Configure Apache as a caching proxy fronting these services. This means that you can tolerate downtime for the services and you have quicker builds (since you don’t need to contact remote servers). It also has a security benefit (you can firewall of your build server such that it can’t make any outgoing connections) and it’s nice to avoid consuming the bandwidth of those registries (especially since they are provided for free).

Without further ado, here are the config bits for Apache 2.4

/etc/apache2/force_cache_proxy.conf – the general configuration file for caching:

# Security - we don't want to act as a proxy to arbitrary hosts
ProxyRequests Off
SSLProxyEngine On
 
# Cache files to disk
CacheEnable disk /
CacheMinFileSize 0
# cache up to 100MB
CacheMaxFileSize 104857600
# Expire cache in one day
CacheMinExpire 86400
CacheDefaultExpire 86400
# Try really hard to cache requests
CacheIgnoreCacheControl On
CacheIgnoreNoLastMod On
CacheStoreExpired On
CacheStoreNoStore On
CacheStorePrivate On
# If remote can't be reached, reply from cache
CacheStaleOnError On
# Provide information about cache in reply headers
CacheDetailHeader On
CacheHeader On
 
# Only allow requests from localhost
<Location />
        Order Deny,Allow
        Deny from all
        Allow from 127.0.0.1
</Location>
 
<Proxy *>
        # Don't send X-Forwarded-* headers - don't leak local hosts
        # And some servers get confused by them
        ProxyAddHeaders Off
</Proxy>

# Small timeout to avoid blocking the build to long
ProxyTimeout    5

Now with this prepared we can create the individual configurations for the services we wish to proxy:

For pypi:

# pypi mirror
Listen 127.1.1.1:8001

<VirtualHost 127.1.1.1:8001>
        Include force_cache_proxy.conf

        ProxyPass         /  https://pypi.python.org/ status=I
        ProxyPassReverse  /  https://pypi.python.org/
</VirtualHost>

For npm:

# npm mirror
Listen 127.1.1.1:8000

<VirtualHost 127.1.1.1:8000>
        Include force_cache_proxy.conf

        ProxyPass         /  https://registry.npmjs.org/ status=I
        ProxyPassReverse  /  https://registry.npmjs.org/
</VirtualHost>

After configuration you need to enable the site (a2ensite) as well as needed modules (a2enmod – ssl, cache, disk_cache, proxy, proxy_http).

Finally you need to configure your package manager clients to use these endpoints:

For npm you need to edit ~/.npmrc (or use npm config set) and add registry = http://127.1.1.1:8000/

For Python / pip you need to edit ~/.pip/pip.conf (I recommend having download-cache as per Stavros’s post):

[global]
download-cache = ~/.cache/pip/
index-url = http://127.1.1.1:8001/simple/

If you use setuptools (why!? just stop and use pip :-)), your config is ~/.pydistutils.cfg:

[easy_install]
index_url = http://127.1.1.1:8001/simple/

Also, if you use buildout, the needed config adjustment in buildout.cfg is:

[buildout]
index = http://127.1.1.1:8001/simple/

This is mostly it. If your client is using any kind of local caching, you should clear your cache and reinstall all the dependencies to ensure that Apache has them cached on the disk. There are also dedicated solutions for caching the repositories (for example devpi for python and npm-lazy-mirror for node), however I found them somewhat unreliable and with Apache you have a uniform solution which already has things like startup / supervision implemented and which is familiar to most sysadmins.

]]>
https://grey-panther.net/2014/02/proxying-pypi-npm-etc-for-fun-and-profit.html/feed 0 9
GHDB mirror https://grey-panther.net/2009/01/ghdb-mirror.html https://grey-panther.net/2009/01/ghdb-mirror.html#respond Thu, 08 Jan 2009 15:29:00 +0000 https://grey-panther.net/?p=471 Seeing that the GHDB (Google Hacking DataBase) might soon disappear (the site was offline for weeks recently for example), I grabbed a mirror of it and put it up on a free hosting website (no, not that one) – enjoy it while it lasts:

  • the main page
  • a link to each individual entry – this was needed because the navigation system was based on javascript :-(, and HTTrack – although amazingly it was able to find all the links, it wasn’t able to modify the JS such that the navigation works.
  • If you want to download all the pages at one, grab them here (the extension is .JAR, but in fact it is just a ZIP file – as all JAR files are ZIP files)
]]>
https://grey-panther.net/2009/01/ghdb-mirror.html/feed 0 471