Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: scim #566

Open
wants to merge 56 commits into
base: main
Choose a base branch
from
Open

feat: scim #566

wants to merge 56 commits into from

Conversation

raphael-cohere
Copy link

@raphael-cohere raphael-cohere commented Aug 1, 2024

This PR adds SCIM functionality to the project. It allows an organisation to sync their users and groups from their idp.

Allows synchronisation of users and groups from Okta to toolkit. While SCIM is an open standard, this implementation was only tested with Okta. If we want to add more providers we would need to retest, but as long as they are following the spec it should be fine (similar to OIDC).

General Notes

  • this only works when OIDC is enabled. We match the users from the OIDC request to the users that were synchronised. (via external id)
  • this feature should only be activated if there were no users created so far, otherwise there could be side effects

Resources

PR Template

Thank you for contributing to the Cohere Toolkit!

  • PR title: "area: description"

    • Where "area" is whichever of interface, frontend, model, tools, backend, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes.
    • Example: "deployment: add Azure model option"
  • PR message: Delete this entire checklist and replace with

    • Description: a description of the change
    • Issue: the issue # it fixes, if applicable
    • Dependencies: any dependencies required for this change
  • Add tests and docs: Please include testing and documentation for your changes

  • Lint and test: Run make lint and make run-tests

AI Description

This PR introduces the System for Cross-domain Identity Management (SCIM) to the backend. SCIM is a standard for automating the exchange of user identity information between identity domains, such as between an enterprise identity management system and a service provider.

The changes include:

  • Adding SCIM configuration to the .env-template file.
  • Creating a new SCIMAuthValidation class in request_validators.py for SCIM authentication.
  • Adding SCIM-related imports and a new SCIM router in routers.py.
  • Adding SCIM-related fields to the secrets.template.yaml and settings.py files.
  • Creating new CRUD operations for groups and users in group.py and user.py, respectively.
  • Creating new SCIM exception handling and router in scim.py.
  • Adding SCIM-related imports and classes in schemas/scim.py.
  • Adding a ScimAuthValidation class in request_validators.py for SCIM authentication.
  • Updating the get_or_create_user function in utils.py to use the get_user_by_external_id function.
  • Adding SCIM-related tests in test_scim.py.
  • Adding SCIM-related configuration to the secrets.yaml file.

Copy link

cla-assistant bot commented Aug 1, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ mtanzim
❌ raphael-cohere
You have signed the CLA already but the status is still pending? Let us recheck it.

@mtanzim mtanzim changed the title feat: slim feat: scim Aug 1, 2024
raphael-cohere and others added 5 commits August 2, 2024 16:22
# Conflicts:
#	src/backend/config/routers.py
#	src/backend/main.py
#	src/backend/tests/factories/__init__.py
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 93.28358% with 36 lines in your changes missing coverage. Please review.

Project coverage is 83.50%. Comparing base (4edacee) to head (76eaf7d).
Report is 96 commits behind head on main.

Files Patch % Lines
src/backend/routers/scim.py 90.00% 13 Missing ⚠️
src/backend/tests/crud/test_group.py 79.48% 8 Missing ⚠️
src/backend/services/auth/utils.py 0.00% 6 Missing ⚠️
src/backend/alembic/versions/b6a70f628c24_.py 73.68% 5 Missing ⚠️
src/backend/services/auth/request_validators.py 91.66% 2 Missing ⚠️
src/backend/crud/group.py 97.14% 1 Missing ⚠️
src/backend/crud/user.py 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #566       /-   ##
==========================================
  Coverage   81.38%   83.50%    2.11%     
==========================================
  Files         199      243       44     
  Lines        8354    11037     2683     
==========================================
  Hits         6799     9216     2417     
- Misses       1555     1821      266     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Depends(get_session),
Depends(ScimAuthValidation()),
],
"auth": [],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why none in auth?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

misunderstand how the dependency system work, also added to auth now

async def get_users(
session: DBSessionDep,
count: int = 100,
start_index: int = 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 1 not 0?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the spec 🤷

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://datatracker.ietf.org/doc/html/rfc7644
startIndex The 1-based index of the first result in the current set
of list results. REQUIRED when partial results are returned due
to pagination.

) -> ListUserResponse:
if filter:
single_filter = filter.split(" ")
filter_value = single_filter[2].strip('"')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do validation here?

session: DBSessionDep,
count: int = 100,
start_index: int = 1,
filter: Optional[str] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have filter and the one below for username? Can you only filter by username?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For okta yes. That's all they support. Might be different for other providers.

async def get_groups(
session: DBSessionDep,
count: int = 100,
start_index: int = 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions about start index, fitler (just by username) etc

src/backend/routers/scim.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants