Download an entire website with wget, along with assets.
# One liner
wget --recursive --page-requisites --adjust-extension --span-hosts --convert-links --restrict-file-names=windows --domains yoursite.com --no-parent yoursite.com
# Explained
wget \
--recursive \ # Download the whole site.
--page-requisites \ # Get all assets/elements (CSS/JS/images).
--adjust-extension \ # Save files with .html on the end.
--span-hosts \ # Include necessary assets from offsite as well.
--convert-links \ # Update links to still work in the static version.
--restrict-file-names=windows \ # Modify filenames to work in Windows as well.
--domains yoursite.com \ # Do not follow links outside this domain.
--no-parent \ # Don't follow links outside the directory you pass in.
yoursite.com/whatever/path # The URL to download.
As a web developer, I am thinking again about my experience with the mobile web on the day after the storm, and the following week. I remember trying in vain to find information about the storm damage and road closures, watching loaders spin and spin on blank pages until they timed out. Once in a while a page would finally load, or partially load, and I could actually click a second or third link. We had a tiny bit of service, but not much. At one point we drove down our main street looking for service, eventually finding cars congregating in a closed fast-food parking lot where there were a few bars of signal!
When I was able to load some government and emergency sites, problems with loading speed and website content became very apparent. We tried to find out the situation with the highways on the government site that tracks road closures. I wasn’t able to view the big, slow-loading interactive map, and got a pop-up with an API failure message. I wish the main closures had been listed more simply, so I could have seen that the highway was completely closed by a landslide. //
During the outages, many people got information from the local radio station’s ongoing broadcasts. The best information I received came from an unlikely place: a simple bulleted list in a daily email newsletter from our local state representative. Every day that newsletter listed food and water, power and gas, shelter locations, road and cell service updates, etc.
I was struck by how something as simple as text content could have such a big impact.
With the best information coming from a simple newsletter list, I found myself wishing for faster-loading, more direct websites, especially ones with this sort of information. At that time, even a plain-text site with barely any styles or images would have been better.
wallabag is a self-hostable application for saving web pages: Save and classify articles. Read them later. Freely.
“Really Simple Licensing” makes it easier for creators to get paid for AI scraping. //
Leading Internet companies and publishers—including Reddit, Yahoo, Quora, Medium, The Daily Beast, Fastly, and more—think there may finally be a solution to end AI crawlers hammering websites to scrape content without permission or compensation.
Announced Wednesday morning, the “Really Simple Licensing” (RSL) standard evolves robots.txt instructions by adding an automated licensing layer that’s designed to block bots that don’t fairly compensate creators for content.
Free for any publisher to use starting today, the RSL standard is an open, decentralized protocol that makes clear to AI crawlers and agents the terms for licensing, usage, and compensation of any content used to train AI, a press release noted.
Thirty years ago today, Netscape Communications and Sun Microsystems issued a joint press release announcing JavaScript, an object scripting language designed for creating interactive web applications. The language emerged from a frantic 10-day sprint at pioneering browser company Netscape, where engineer Brendan Eich hacked together a working internal prototype during May 1995.
While the JavaScript language didn’t ship publicly until that September and didn’t reach a 1.0 release until March 1996, the descendants of Eich’s initial 10-day hack now run on approximately 98.9 percent of all websites with client-side code, making JavaScript the dominant programming language of the web. It’s wildly popular; beyond the browser, JavaScript powers server backends, mobile apps, desktop software, and even some embedded systems. According to several surveys, JavaScript consistently ranks among the most widely used programming languages in the world. //
The JavaScript partnership secured endorsements from 28 major tech companies, but amusingly, the December 1995 announcement now reads like a tech industry epitaph. The endorsing companies included Digital Equipment Corporation (absorbed by Compaq, then HP), Silicon Graphics (bankrupt), and Netscape itself (bought by AOL, dismantled). Sun Microsystems, co-creator of JavaScript and owner of Java, was acquired by Oracle in 2010. JavaScript outlived them all. //
Confusion about its relationship to Java continues: The two languages share a name, some syntax conventions, and virtually nothing else. Java was developed by James Gosling at Sun Microsystems using static typing and class-based objects. JavaScript uses dynamic typing and prototype-based inheritance. The distinction between the two languages, as one Stack Overflow user put it in 2010, is similar to the relationship between the words “car” and “carpet.” //
The language now powers not just websites but mobile applications through frameworks like React Native, desktop software through Electron, and server infrastructure through Node.js. Somewhere around 2 million to 3 million packages exist on npm, the JavaScript package registry.
One of the most anxiety-inducing parts of self-hosting for me is ensuring that everything is as locked-down security-wise as possible. That's become even more critical as I increase my footprint, adding my own domain and subdomains that point to each service. I'm also a little particular, and while I could use a self-signed TLS certificate to ensure HTTPS for the services that need it, the reminder that it hasn't been done "properly" every time I access those services irks me.
And while there are any number of reverse proxies that I could use to access those services, few are as easy to set up and use as Caddy. //
Officially, Caddy is an open-source web server that can be used for many things. But because it's so easy to set up and includes built-in automatic HTTPS with TLS certificate management, it's often used as a reverse proxy for the home lab. That's because every domain, IP address, and even localhost is served over HTTPS, thanks to the fully automated, self-managed certificate authority.
The entire server is controlled by a single configuration file, the "Caddyfile," which is human-readable, and most tasks are handled with a few simple lines of text.
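As a minimal sketch, a Caddyfile that reverse-proxies a single subdomain to a local service can be just a few lines; the hostname and upstream port below are placeholders, not anything from the article:
# Caddyfile (placeholder hostname and upstream port)
service.example.com {
    reverse_proxy localhost:8080   # Forward requests to the local service.
}
With a public hostname like this, Caddy obtains and renews the TLS certificate automatically; for localhost or private addresses it falls back to its internal certificate authority.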
Call it a badge, sticker, button, or whatever you'd like. Create yours below. Pick some colors, enter some text, and you'll get a button you can download for your site.
Announced on September 24, Cloudflare’s Content Signals Policy is an effort to use the company’s influential market position to change how content is used by web crawlers. It involves updating millions of websites’ robots.txt files. //
Historically, robots.txt simply included a list of paths on the domain flagged as either “allow” or “disallow.” It was technically not enforceable, but it became an effective honor system because it offered advantages to the owners of both the website and the crawler: Website owners could dictate access for various business reasons, and it helped crawlers avoid working through data that wouldn’t be relevant. //
That allow/disallow model says nothing about how content may be used once it has been crawled. The Content Signals Policy initiative is a newly proposed format for robots.txt that intends to address exactly that: it allows website operators to opt in or out of consenting to the following use cases, as worded in the policy:
- search: Building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website’s contents). Search does not include providing AI-generated search summaries.
- ai-input: Inputting content into one or more AI models (e.g., retrieval augmented generation, grounding, or other real-time taking of content for generative AI search answers).
- ai-train: Training or fine-tuning AI models.
Cloudflare has given all of its customers quick paths for setting those values on a case-by-case basis. Further, it has automatically updated robots.txt on the 3.8 million domains that already use Cloudflare’s managed robots.txt feature, with search defaulting to yes, ai-train to no, and ai-input blank, indicating a neutral position.
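As a rough illustration of those defaults, a content-signals-annotated robots.txt is meant to look something like the sketch below; the directive name and exact syntax follow my reading of the policy, so check the published spec before relying on it:
# robots.txt (illustrative sketch of Cloudflare's managed defaults)
User-Agent: *
Content-Signal: search=yes, ai-train=no   # ai-input omitted = neutral position
Allow: /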
Are your links not malicious-looking enough?
This tool is guaranteed to help with that!
What is this and what does it do?
This is a tool that takes any link and makes it look malicious. It works on the idea of a redirect, much like https://tinyurl.com/: where TinyURL makes a URL shorter, this site makes it look malicious.
Place any link in the input below, press the button, and get back a fishy (phishy, heh... get it?) looking link. The fishy link doesn't actually do anything; it just redirects you to the original link you provided.
Selecting the domain name(s) for your website is one of the most important decisions you will need to make when starting a new site. It's probably the first thing you will do, yet it has some of the most lasting implications.
Generate a logo by configuring the settings below. Download your logo in a variety of layouts and formats.
Quickly generate your favicon from text by selecting the text, fonts, and colors. Download your favicon in the most up-to-date formats.
Quickly generate your favicon from an image by uploading your image below. Download your favicon in the most up-to-date formats.
ntfy (pronounced notify) is a simple HTTP-based pub-sub notification service. It allows you to send notifications to your phone or desktop via scripts from any computer, and/or using a REST API. It's infinitely flexible, and 100% free software.
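As a quick illustration of that HTTP interface, publishing a notification from a script is a single request; the topic name below is a made-up example:
# Publish a message to the (hypothetical) topic "homelab-alerts" on the public ntfy.sh server.
curl -d "Backup finished" ntfy.sh/homelab-alerts
Anyone subscribed to that topic in the ntfy app or web UI receives the message as a push notification.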
Creating a website doesn't have to be complicated or expensive. With the Publii app, the most intuitive CMS for static sites, you can create a beautiful, safe, and privacy-friendly website quickly and easily; perfect for anyone who wants a fast, secure website in a flash. //
The goal of Publii is to make website creation simple and accessible for everyone, regardless of skill level. With an intuitive user interface and built-in privacy tools, Publii combines powerful and flexible options that make it the perfect platform for anyone who wants a hassle-free way to build and manage a blog, portfolio or documentation website.
Create QR Codes & Shortcuts
listmonk is a self-hosted, high performance one-way mailing list and newsletter manager. It comes as a standalone binary and the only dependency is a Postgres database. //
Simple API to send arbitrary transactional messages to subscribers using pre-defined templates. Send messages as e-mail, SMS, WhatsApp, or any other medium via Messenger interfaces.
Manage millions of subscribers across many single and double opt-in one-way mailing lists with custom JSON attributes for each subscriber. Query and segment subscribers with SQL expressions.
Use the fast bulk importer (~10k records per second) or use HTTP/JSON APIs or interact with the simple table schema to integrate external CRMs and subscriber databases.
Write HTML e-mails in a WYSIWYG editor, Markdown, raw syntax-highlighted HTML, or just plain text.
Use the media manager to upload images for e-mail campaigns to the server's filesystem, Amazon S3, or any S3-compatible (MinIO) backend.
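As a rough sketch of the HTTP/JSON API mentioned above, adding a subscriber could look something like this; the host, credentials, list ID, and field names are assumptions on my part, so verify them against the listmonk API documentation:
# Hypothetical example: create a subscriber on list ID 1 via the JSON API.
curl -u 'api_user:api_token' -X POST 'http://localhost:9000/api/subscribers' \
  -H 'Content-Type: application/json' \
  -d '{"email": "user@example.com", "name": "Jane Doe", "status": "enabled", "lists": [1]}'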
Historically, we could link to a certain part of a page only if that part had an ID: all we needed to do was link to the URL and append the document fragment (the ID). If the part we wanted to link to had no such anchor, we were out of luck. That was until we were blessed with Text fragments!
What are Text fragments?
Text fragments are a powerful feature of the modern web platform that allows for precise linking to specific text within a web page without the need to add an anchor! This feature is complemented by the ::target-text CSS pseudo-element, which provides a way to style the highlighted text.
Text fragments work by appending a special syntax to the end of a URL, just like we used to append the ID after the hash symbol (#). A browser that supports text fragments interprets this part of the URL, searches for the specified text on the page, and then scrolls to and highlights that text. If the user then navigates the document by pressing Tab, focus moves to the next focusable element after the text fragment.
How can we use it?
Here’s the basic syntax for a text fragment URL:
https://example.com/page.html#:~:text=[prefix-,]textStart[,textEnd][,-suffix]
Following the hash symbol, we add the special syntax :~: (also known as the fragment directive), then text= followed by:
- prefix-: A text string followed by a hyphen, specifying what text should immediately precede the linked text. This helps the browser link to the correct text when there are multiple matches. This part is not highlighted.
- textStart: The beginning of the text you’re highlighting.
- textEnd: The ending of the text you’re highlighting.
- -suffix: A hyphen followed by a text string that behaves similarly to the prefix but comes after the text. It is also helpful when multiple matches exist, and it doesn't get highlighted with the linked text.
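Putting it together, a link that scrolls to and highlights a phrase could look like the example below; the page URL and text are made up for illustration:
https://example.com/article.html#:~:text=text%20fragments%20are%20a%20powerful%20feature
And because the highlighted text matches the ::target-text pseudo-element mentioned above, it can be styled with plain CSS:
::target-text {
  background-color: gold;  /* Highlight color for the linked text */
  color: black;
}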