Post

Self-host Karakeep for bookmarks and AI-assisted archives

A secret-free Karakeep deployment pattern with Meilisearch, Chrome crawling, and OpenAI-compatible inference settings.

Self-host Karakeep for bookmarks and AI-assisted archives

Karakeep is a self-hosted bookmark manager and archive system. It can save links, crawl pages, search with Meilisearch, and optionally use AI models to summarize or tag content.

In this lab it runs as a three-container stack:

1
2
3
Karakeep app
  ├── Meilisearch for search
  └── Chrome container for crawling/screenshot tasks

Folder layout

1
2
3
4
/home/ubuntu/bookmarks/
├── docker-compose.yml
├── .env
└── data/

Compose pattern

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
services:
  karakeep:
    image: ghcr.io/karakeep-app/karakeep:release
    container_name: bookmarks-karakeep
    restart: unless-stopped
    env_file:
      - .env
    depends_on:
      - meilisearch
      - chrome
    networks:
      - proxy
      - bookmarks_internal
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.bookmarks-secure.entrypoints=https"
      - "traefik.http.routers.bookmarks-secure.rule=Host(`bookmarks.<your-domain>`)"
      - "traefik.http.routers.bookmarks-secure.tls=true"
      - "traefik.http.routers.bookmarks-secure.tls.certresolver=cloudflare"
      - "traefik.http.routers.bookmarks-secure.service=bookmarks"
      - "traefik.http.services.bookmarks.loadbalancer.server.port=3000"

  meilisearch:
    image: getmeili/meilisearch:v1.41.0
    container_name: bookmarks-meilisearch
    restart: unless-stopped
    environment:
      MEILI_NO_ANALYTICS: "true"
      MEILI_MASTER_KEY: ${MEILI_MASTER_KEY}
    volumes:
      - ./data/meilisearch:/meili_data
    networks:
      - bookmarks_internal

  chrome:
    image: gcr.io/zenika-hub/alpine-chrome:124
    container_name: bookmarks-chrome
    restart: unless-stopped
    command: ["chromium-browser", "--headless", "--no-sandbox", "--disable-gpu", "--remote-debugging-address=0.0.0.0", "--remote-debugging-port=9222"]
    networks:
      - bookmarks_internal

networks:
  proxy:
    external: true
  bookmarks_internal:
    internal: true

Environment file

NEXTAUTH_SECRET=<generate-a-long-secret>
MEILI_MASTER_KEY=<generate-a-long-secret>
NEXTAUTH_URL=https://bookmarks.<your-domain>
MEILI_ADDR=http://meilisearch:7700
BROWSER_WEB_URL=http://chrome:9222

# Optional OpenAI-compatible inference
OPENAI_BASE_URL=https://api.example.com/openai/v1
OPENAI_API_KEY=<your-api-key>
INFERENCE_TEXT_MODEL=<model-that-supports-json-schema>
INFERENCE_IMAGE_MODEL=<vision-capable-model-if-needed>

Do not publish the real values.


Important AI lesson

Karakeep inference can require structured JSON outputs. Not every OpenAI-compatible model supports response_format: json_schema.

If your logs show model failures around structured output, switch to a model/provider that supports JSON schema responses.


Verification

1
2
3
4
cd /home/ubuntu/bookmarks
docker compose ps
docker compose logs --tail=100 karakeep
curl -I https://bookmarks.<your-domain>

Expected public HTTP behavior:

1
HTTP/2 200

or a redirect to the app’s own login page.


Backups

The important data is usually under the mounted data folder and the .env file.

1
tar -czf bookmarks-backup-$(date +%Y%m%d).tar.gz docker-compose.yml .env data/

Store the archive somewhere private because it contains secrets.

These posts connect to this topic and help build the bigger homelab picture:

This post is licensed under CC BY 4.0 by the author.