GitLab Cells Development Guidelines
For background on GitLab Cells, refer to the design document.
Available Cells / Organization schemas
Below are available schemas related to Cells and Organizations:
| Schema | Description |
|---|---|
| gitlab_main (deprecated) | This is being replaced with gitlab_main_org, for the purpose of building the Cells architecture. |
| gitlab_main_cell (deprecated) | All gitlab_main_cell tables are being moved to gitlab_main_org. gitlab_main_org is a better name for gitlab_main_cell; there is no functional difference between the two. |
| gitlab_main_org | Use for all tables in the main: database that are for an Organization. For example, projects and groups. |
| gitlab_main_cell_setting | All tables in the main: database related to cell settings. For example, application_settings. These cell-local tables should not have any foreign key references from or to organization tables. |
| gitlab_main_cell_local | For tables in the main: database that are related to features that are distinct for each cell. For example, zoekt_nodes or shards. These cell-local tables should not have any foreign key references from or to organization tables. |
| gitlab_ci | Use for all tables in the ci: database that are for an Organization. For example, ci_pipelines and ci_builds. |
| gitlab_ci_cell_local | For tables in the ci: database that are related to features that are distinct for each cell. For example, instance_type_ci_runners or ci_cost_settings. These cell-local tables should not have any foreign key references from or to organization tables. |
| gitlab_main_user | Schema for all User-related tables, for example users and emails. Most user functionality is at the organization level and should use gitlab_main_org instead (for example, commenting on an issue). For user functionality that is not at the organization level, use this schema. Tables in this schema must strictly belong to a user. |
| gitlab_shared_org | Schema for tables that hold data across multiple databases and have an organization_id for sharding. These tables inherit from Gitlab::Database::SharedModel. Tables in this schema are not allowed to use auto-incrementing integer primary keys, so that rows across the decomposed databases have unique primary keys. Use composite or UUID primary keys instead. |
| gitlab_shared_cell_local | Schema for cell-local shared tables that do not require sharding and exist across multiple databases. For example, loose_foreign_keys_deleted_records. These tables also inherit from Gitlab::Database::SharedModel. |
Most tables will require a sharding key to be defined.
To understand how existing tables are classified, you can use this dashboard.
After a schema has been assigned, the merge request pipeline might fail due to one or more of the following reasons, which can be rectified by following the linked guidelines:
Creating a new schema
Schemas should require a sharding key by default, because features should be scoped to an Organization by default.
# db/gitlab_schemas/gitlab_ci.yaml
require_sharding_key: true
sharding_root_tables:
- projects
- namespaces
- organizations
Setting require_sharding_key to true means that tables assigned to that
schema will require a sharding_key to be set.
You will also need to configure the list of allowed sharding_root_tables that can be used as sharding keys for tables in this schema.
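The relationship between require_sharding_key and sharding_root_tables can be sketched in plain Ruby. This is only an illustration, assuming a table declares its sharding key as a mapping from a column name to the table that column references. The helper allowed_sharding_key? and the sample keys are hypothetical and are not part of GitLab.

```ruby
require 'yaml'

# Schema definition mirroring the example above.
SCHEMA = YAML.safe_load(<<~CONFIG)
  require_sharding_key: true
  sharding_root_tables:
    - projects
    - namespaces
    - organizations
CONFIG

# Hypothetical helper: a table passes when the schema does not require a
# sharding key, or when every column in its sharding key points at one of
# the schema's allowed root tables.
def allowed_sharding_key?(schema, sharding_key)
  return true unless schema['require_sharding_key']
  return false if sharding_key.nil? || sharding_key.empty?

  sharding_key.values.all? { |table| schema['sharding_root_tables'].include?(table) }
end

puts allowed_sharding_key?(SCHEMA, { 'project_id' => 'projects' }) # true
puts allowed_sharding_key?(SCHEMA, { 'user_id' => 'users' })       # false
puts allowed_sharding_key?(SCHEMA, nil)                            # false
```

In this sketch, a missing sharding key is rejected whenever the schema requires one, which matches the default recommended above.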
Database sequences
We ensure that database sequences are unique across all cells.
This means the id columns of most tables are unique across cells.
For technical implementation and architecture decisions, refer to:
Unique constraints
If you require data to be unique, it should be scoped to be unique per
Organization, Group, Project, or User.
Because each cell has its own independent database, you can no longer
rely on UNIQUE constraints to guarantee uniqueness across cells.
You have two options:
- Ensure the index is scoped to include the sharding_key as one of its columns.
- For the rare case where an attribute must be unique globally, across all organizations, use the Claim service.
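The difference between the two index shapes can be sketched with an in-memory model. This is a minimal illustration, assuming all rows of an organization live in a single cell. The CellTable class is a hypothetical stand-in for one table with one unique index in one cell's database. It is not GitLab code.

```ruby
require 'set'

# Hypothetical stand-in for a table with a single unique index in one cell.
class CellTable
  def initialize(unique_columns)
    @unique_columns = unique_columns
    @index = Set.new
  end

  # Returns true when the row is stored, false when it violates the index.
  def insert(row)
    key = row.values_at(*@unique_columns)
    !@index.add?(key).nil?
  end
end

# A unique index on name alone is enforced separately in each cell, so both
# cells accept the same value and nothing detects the duplicate.
cell_a = CellTable.new([:name])
cell_b = CellTable.new([:name])
puts cell_a.insert({ name: 'docs' }) # true
puts cell_b.insert({ name: 'docs' }) # true

# An index that includes the sharding key only promises uniqueness within an
# organization, which a single cell can check on its own.
scoped = CellTable.new([:organization_id, :name])
puts scoped.insert({ organization_id: 1, name: 'docs' }) # true
puts scoped.insert({ organization_id: 2, name: 'docs' }) # true
puts scoped.insert({ organization_id: 1, name: 'docs' }) # false
```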
Claim service
To use the Claim service from Rails, see Claiming an attribute for a cell.
Static data
Problem: A database table is used to store static data. However, the primary key is not static because it uses an auto-incrementing sequence. This means the primary key is not globally consistent.
References to this inconsistent primary key will create problems because the reference clashes across cells / organizations.
Example: The plans table on a given Cell has the following data:
id | name | title
----+------------------------------+----------------------------------
1 | default | Default
2 | bronze | Bronze
3 | silver | Silver
5 | gold | Gold
7 | ultimate_trial | Ultimate Trial
8 | premium_trial | Premium Trial
9 | opensource | Opensource
4 | premium | Premium
6 | ultimate | Ultimate
10 | ultimate_trial_paid_customer | Ultimate Trial for Paid Customer
(10 rows)
On another cell, the plans table has differing ids for the same name:
id | name | title
----+------------------------------+------------------------------
1 | default | Default
2 | bronze | Bronze
3 | silver | Silver
4 | premium | Premium
5 | gold | Gold
6 | ultimate | Ultimate
7 | ultimate_trial | Ultimate Trial
8 | ultimate_trial_paid_customer | Ultimate Trial Paid Customer
9 | premium_trial | Premium Trial
10 | opensource | Opensource
This plans.id column is then used as a reference in the hosted_plan_id
column of the gitlab_subscriptions table.
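The clash can be reproduced with the rows that differ between the two tables above. This sketch uses plain hashes in place of the plans table, and the subscription hash is a hypothetical row that stores only the numeric reference.

```ruby
# id => name pairs copied from the two tables above (ids 8 to 10 differ).
CELL_A_PLANS = {
  8 => 'premium_trial',
  9 => 'opensource',
  10 => 'ultimate_trial_paid_customer'
}.freeze

CELL_B_PLANS = {
  8 => 'ultimate_trial_paid_customer',
  9 => 'premium_trial',
  10 => 'opensource'
}.freeze

# Hypothetical subscription row holding only the sequence-based id.
subscription = { hosted_plan_id: 8 }

# The same reference resolves to a different plan depending on the cell.
puts CELL_A_PLANS.fetch(subscription[:hosted_plan_id]) # premium_trial
puts CELL_B_PLANS.fetch(subscription[:hosted_plan_id]) # ultimate_trial_paid_customer
```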
Solution: Use globally unique references, not a database sequence. If possible, hard-code static data in application code, instead of using the database.
In this case, the plans table can be dropped, and replaced with a fixed model
(details can be found in the configurable status design doc):
class Plan
  include ActiveRecord::FixedItemsModel::Model

  # Static rows that previously lived in the plans table.
  ITEMS = [
    { id: 1, name: "default", title: "Default" },
    { id: 2, name: "bronze", title: "Bronze" },
    { id: 3, name: "silver", title: "Silver" },
    { id: 4, name: "premium", title: "Premium" },
    { id: 5, name: "gold", title: "Gold" },
    { id: 6, name: "ultimate", title: "Ultimate" },
    { id: 7, name: "ultimate_trial", title: "Ultimate Trial" },
    { id: 8, name: "ultimate_trial_paid_customer", title: "Ultimate Trial Paid Customer" },
    { id: 9, name: "premium_trial", title: "Premium Trial" },
    { id: 10, name: "opensource", title: "Opensource" }
  ]

  attribute :name, :string
  attribute :title, :string
end
You can use model validations and ActiveRecord-like methods such as all, where, find_by, and find:
Plan.find(4)
Plan.find_by(name: 'premium')
Plan.where(name: 'gold').first
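To show what these lookups amount to when the data is hard-coded, here is a plain-Ruby stand-in. It is a hypothetical sketch that only mimics the four methods named above over an in-memory list. It is not the module included in the Plan example, and the FixedPlan name and its reduced item list are assumptions made for illustration.

```ruby
# Hypothetical stand-in for a fixed model; no database is involved.
class FixedPlan
  ITEMS = [
    { id: 4, name: 'premium', title: 'Premium' },
    { id: 5, name: 'gold', title: 'Gold' },
    { id: 6, name: 'ultimate', title: 'Ultimate' }
  ].freeze

  attr_reader :id, :name, :title

  def initialize(id:, name:, title:)
    @id = id
    @name = name
    @title = title
  end

  # Every item, built from the hard-coded list rather than a query.
  def self.all
    ITEMS.map { |attrs| new(**attrs) }
  end

  # Items whose attributes match all of the given conditions.
  def self.where(**conditions)
    all.select do |item|
      conditions.all? { |key, value| item.public_send(key) == value }
    end
  end

  # First matching item, or nil when nothing matches.
  def self.find_by(**conditions)
    where(**conditions).first
  end

  # In this sketch, find returns nil for an unknown id instead of raising.
  def self.find(id)
    find_by(id: id)
  end
end

puts FixedPlan.find(4).name                    # premium
puts FixedPlan.find_by(name: 'premium').id     # 4
puts FixedPlan.where(name: 'gold').first.title # Gold
```

Because the ids are part of the code, every cell that runs the same code resolves the same id to the same item.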
The hosted_plan_id column will also be updated to refer to the fixed model's
id value.
You can also store associations with other models. For example:
class CurrentStatus < ApplicationRecord
  belongs_to_fixed_items :system_defined_status, fixed_items_class: WorkItems::Statuses::SystemDefined::Status
end
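One way to picture such an association is a reader that looks up the stored id in the fixed class. The following plain-Ruby sketch assumes the record keeps the reference in a column named after the association with an _id suffix. The macro belongs_to_fixed_items_sketch, the FixedStatus class, and its two items are all hypothetical and do not describe the behavior of belongs_to_fixed_items itself.

```ruby
# Hypothetical fixed model with two hard-coded items.
class FixedStatus
  ITEMS = { 1 => 'To do', 2 => 'Done' }.freeze

  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # Returns the item for the id, or nil when the id is unknown.
  def self.find(id)
    name = ITEMS[id]
    name && new(id, name)
  end
end

# Hypothetical macro: defines a reader that resolves "<name>_id" through the
# given fixed class.
module FixedItemsAssociationSketch
  def belongs_to_fixed_items_sketch(name, fixed_items_class:)
    define_method(name) do
      fixed_items_class.find(public_send("#{name}_id"))
    end
  end
end

class CurrentStatusSketch
  extend FixedItemsAssociationSketch

  # Stands in for a database column on the record.
  attr_accessor :status_id

  belongs_to_fixed_items_sketch :status, fixed_items_class: FixedStatus
end

record = CurrentStatusSketch.new
record.status_id = 2
puts record.status.name # Done
```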
Examples of hard-coding static data include:
- VisibilityLevel
- Static defaults for work item statuses
- Ai::Catalog::BuiltInTool
- WorkItems::TypesFramework::SystemDefined::RelatedLinkRestriction
Other topics
For routing, see HTTP Router. For cluster-wide services, see Topology Service.