Skip to content

SSO 与 OAuth 认证

配置 Google OAuth、GitHub OAuth 和通用 OIDC 以实现安全的生产环境认证。

SSO 与 OAuth 认证

v2.3.0 新增

默认情况下,Potato 使用简单的用户名登录,标注者输入任意用户名即可登录。这对于本地开发和小型项目来说很方便,但生产环境的部署需要适当的认证来防止未授权访问、确保标注者身份并符合机构要求。

Potato 2.3 增加了对三种认证方法的支持:

  1. Google OAuth -- 使用 Google 账户登录,可选域名限制
  2. GitHub OAuth -- 使用 GitHub 账户登录,可选组织限制
  3. 通用 OIDC -- 连接任何 OpenID Connect 提供商(Okta、Azure AD、Auth0、Keycloak 等)

所有方法都可以与 Potato 现有的用户名登录结合使用,实现混合模式配置。

Google OAuth

前提条件

  1. Google Cloud Console 中创建项目
  2. 启用 "Google Identity" API
  3. 创建 OAuth 2.0 凭据(Web 应用类型)
  4. 将你的 Potato 服务器 URL 添加到"已授权的重定向 URI":https://your-server.com/auth/google/callback

配置

yaml
authentication:
  method: google_oauth
 
  google_oauth:
    client_id: ${GOOGLE_CLIENT_ID}
    client_secret: ${GOOGLE_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/google/callback"
 
    # Optional: restrict to specific domain(s)
    allowed_domains:
      - "umich.edu"
      - "research-lab.org"
 
    # Optional: auto-register new users on first login
    auto_register: true
 
    # Optional: map Google profile fields to Potato user fields
    field_mapping:
      username: email            # use email as Potato username
      display_name: name         # show Google display name

域名限制

当设置 allowed_domains 时,只有来自这些域名的电子邮件地址的用户才能登录。其他用户会看到错误消息:

yaml
authentication:
  method: google_oauth
  google_oauth:
    client_id: ${GOOGLE_CLIENT_ID}
    client_secret: ${GOOGLE_CLIENT_SECRET}
    allowed_domains:
      - "umich.edu"
    domain_error_message: "This annotation task is restricted to University of Michigan accounts."

GitHub OAuth

前提条件

  1. 前往 GitHub Settings > Developer settings > OAuth Apps > New OAuth App
  2. 将 "Authorization callback URL" 设置为 https://your-server.com/auth/github/callback
  3. 记下你的 Client ID 并生成 Client Secret

配置

yaml
authentication:
  method: github_oauth
 
  github_oauth:
    client_id: ${GITHUB_CLIENT_ID}
    client_secret: ${GITHUB_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/github/callback"
 
    # Optional: restrict to members of specific GitHub organizations
    allowed_organizations:
      - "my-research-lab"
      - "university-nlp-group"
 
    # Optional: restrict to specific teams within an organization
    allowed_teams:
      - "my-research-lab/annotators"
 
    # Scopes to request
    scopes:
      - "read:user"
      - "read:org"             # needed for organization restriction
 
    auto_register: true
 
    field_mapping:
      username: login            # GitHub username
      display_name: name

组织限制

GitHub OAuth 可以将访问限制为特定组织的成员。这需要 read:org 权限范围:

yaml
authentication:
  method: github_oauth
  github_oauth:
    client_id: ${GITHUB_CLIENT_ID}
    client_secret: ${GITHUB_CLIENT_SECRET}
    allowed_organizations:
      - "my-research-lab"
    scopes:
      - "read:user"
      - "read:org"
    org_error_message: "You must be a member of the my-research-lab GitHub organization."

通用 OIDC

对于企业 SSO 提供商(Okta、Azure AD、Auth0、Keycloak 等),使用通用 OpenID Connect 集成。

配置

yaml
authentication:
  method: oidc
 
  oidc:
    # Discovery URL (provider's .well-known endpoint)
    discovery_url: "https://accounts.example.com/.well-known/openid-configuration"
 
    # Or specify endpoints manually
    # authorization_endpoint: "https://accounts.example.com/authorize"
    # token_endpoint: "https://accounts.example.com/token"
    # userinfo_endpoint: "https://accounts.example.com/userinfo"
    # jwks_uri: "https://accounts.example.com/.well-known/jwks.json"
 
    client_id: ${OIDC_CLIENT_ID}
    client_secret: ${OIDC_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/oidc/callback"
 
    scopes:
      - "openid"
      - "profile"
      - "email"
 
    auto_register: true
 
    field_mapping:
      username: preferred_username
      display_name: name
      email: email

Azure AD 示例

yaml
authentication:
  method: oidc
 
  oidc:
    discovery_url: "https://login.microsoftonline.com/${AZURE_TENANT_ID}/v2.0/.well-known/openid-configuration"
    client_id: ${AZURE_CLIENT_ID}
    client_secret: ${AZURE_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/oidc/callback"
    scopes:
      - "openid"
      - "profile"
      - "email"
    auto_register: true
    field_mapping:
      username: preferred_username
      display_name: name

Okta 示例

yaml
authentication:
  method: oidc
 
  oidc:
    discovery_url: "https://your-org.okta.com/.well-known/openid-configuration"
    client_id: ${OKTA_CLIENT_ID}
    client_secret: ${OKTA_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/oidc/callback"
    scopes:
      - "openid"
      - "profile"
      - "email"
      - "groups"               # request group membership
    auto_register: true
 
    # Restrict to specific groups
    allowed_groups:
      - "annotation-team"
 
    field_mapping:
      username: preferred_username
      display_name: name
      groups: groups

Keycloak 示例

yaml
authentication:
  method: oidc
 
  oidc:
    discovery_url: "https://keycloak.example.com/realms/annotation/.well-known/openid-configuration"
    client_id: ${KEYCLOAK_CLIENT_ID}
    client_secret: ${KEYCLOAK_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/oidc/callback"
    scopes:
      - "openid"
      - "profile"
      - "email"
    auto_register: true
    field_mapping:
      username: preferred_username
      display_name: name

域名限制

所有 OAuth 方法都支持基于电子邮件地址的域名限制。当你希望允许特定机构的任何账户时,这很有用:

yaml
authentication:
  method: google_oauth          # or github_oauth, oidc
 
  domain_restriction:
    enabled: true
    allowed_domains:
      - "umich.edu"
      - "stanford.edu"
    error_message: "Access is restricted to university accounts."

对于包含 email 声明的 OIDC 提供商,域名限制会自动生效。对于不包含电子邮件的提供商,你可能需要显式请求 email 权限范围。

自动注册

默认情况下,用户必须在 Potato 中预注册才能进行标注。启用 auto_register: true 后,任何成功认证的用户在首次登录时会自动在 Potato 中创建。

yaml
authentication:
  auto_register: true
  auto_register_role: annotator   # annotator or admin

如需预先批准特定用户并阻止其他人:

yaml
authentication:
  auto_register: false
  allowed_users:
    - "researcher@umich.edu"
    - "student1@umich.edu"
    - "student2@umich.edu"
  user_not_found_message: "Your account has not been approved. Contact the project administrator."

混合模式

你可以同时启用多种认证方法。登录页面为每种启用的方法显示按钮,并可选显示用户名字段。

yaml
authentication:
  methods:
    - google_oauth
    - github_oauth
    - username                   # keep simple username login as fallback
 
  google_oauth:
    client_id: ${GOOGLE_CLIENT_ID}
    client_secret: ${GOOGLE_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/google/callback"
 
  github_oauth:
    client_id: ${GITHUB_CLIENT_ID}
    client_secret: ${GITHUB_CLIENT_SECRET}
    redirect_uri: "https://your-server.com/auth/github/callback"
 
  # Username login settings
  username:
    enabled: true
    require_password: true       # require a password for username login
    password_file: "auth/passwords.yaml"

混合模式在迁移期间很有用:在用户名登录旁边启用 OAuth,然后在所有标注者关联了 OAuth 账户后禁用用户名登录。

会话管理

为已认证用户配置会话行为:

yaml
authentication:
  session:
    lifetime_hours: 24           # session duration
    refresh: true                # refresh session on activity
    cookie_secure: true          # require HTTPS for cookies
    cookie_samesite: "Lax"       # SameSite cookie attribute

管理员认证

管理员账户可以使用相同的 OAuth 流程或单独的认证方法:

yaml
authentication:
  method: google_oauth
  google_oauth:
    client_id: ${GOOGLE_CLIENT_ID}
    client_secret: ${GOOGLE_CLIENT_SECRET}
 
  admin:
    # Admins must match these emails
    admin_emails:
      - "pi@umich.edu"
      - "lead-ra@umich.edu"
 
    # Or use a separate API key for admin access
    api_key: ${ADMIN_API_KEY}

HTTPS 要求

OAuth 提供商在生产环境中要求重定向 URI 使用 HTTPS。对于本地开发,可以使用 HTTP 和 localhost:

yaml
# Development (HTTP allowed)
authentication:
  google_oauth:
    redirect_uri: "http://localhost:8000/auth/google/callback"
 
# Production (HTTPS required)
authentication:
  google_oauth:
    redirect_uri: "https://annotation.example.com/auth/google/callback"

对于反向代理(nginx、Caddy)后面的生产环境部署,确保代理转发正确的 X-Forwarded-Proto 头:

nginx
# nginx example
location / {
    proxy_pass http://localhost:8000;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $host;
}

完整示例

完整的生产环境认证配置,包含 Google OAuth、域名限制和管理员 API 密钥:

yaml
task_name: "Annotation Project"
task_dir: "."
 
authentication:
  method: google_oauth
 
  google_oauth:
    client_id: ${GOOGLE_CLIENT_ID}
    client_secret: ${GOOGLE_CLIENT_SECRET}
    redirect_uri: "https://annotation.umich.edu/auth/google/callback"
    allowed_domains:
      - "umich.edu"
    auto_register: true
    field_mapping:
      username: email
      display_name: name
 
  session:
    lifetime_hours: 48
    cookie_secure: true
 
  admin:
    admin_emails:
      - "pi@umich.edu"
    api_key: ${ADMIN_API_KEY}
 
data_files:
  - "data/instances.jsonl"
 
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    labels: [Positive, Neutral, Negative]
 
output_annotation_dir: "output/"
output_annotation_format: "jsonl"

延伸阅读

有关实现详情,请参阅源文档