kb/create and /datasets

The kb/create endpoint used by the frontend and the /datasets endpoint are essentially the same: both call the same core service method to create a dataset, but they differ in interface design, parameter handling, and response format. kb_app.py:57-78 dataset.py:56-173


Detailed Analysis

Core Implementation is the Same

Both endpoints use the same core service method:

  1. kb/create (Legacy endpoint):
    • Calls KnowledgebaseService.create_with_name() to create the dataset configuration kb_app.py:63-68
    • Uses KnowledgebaseService.save() to persist to the database kb_app.py:74-76
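  2. /datasets (SDK endpoint):
    • Calls the same KnowledgebaseService.create_with_name() method to create the dataset configuration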
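
The shared-service pattern described above can be sketched as follows. This is a minimal illustration, not the project's code: only the method names create_with_name() and save() come from the analysis, while the in-memory store, the (ok, result) return shape, and the handler names kb_create and datasets_create are assumptions made for the example.

```python
import uuid


class KnowledgebaseService:
    """Hypothetical stand-in for the real service layer."""

    _store = {}  # assumed in-memory replacement for the database

    @classmethod
    def create_with_name(cls, name, tenant_id, **options):
        # Assumed contract: return (ok, record_or_error_message).
        if not isinstance(name, str) or not name.strip():
            return False, "dataset name must be a non-empty string"
        record = {"id": uuid.uuid4().hex, "name": name.strip(),
                  "tenant_id": tenant_id, **options}
        return True, record

    @classmethod
    def save(cls, **record):
        cls._store[record["id"]] = record
        return True


def kb_create(payload, user_id):
    """Legacy-style entry point: responds with only the new id."""
    ok, result = KnowledgebaseService.create_with_name(
        name=payload["name"], tenant_id=user_id)
    if not ok:
        raise ValueError(result)
    KnowledgebaseService.save(**result)
    return {"kb_id": result["id"]}


def datasets_create(payload, tenant_id):
    """SDK-style entry point: responds with the full dataset record."""
    options = {k: v for k, v in payload.items() if k != "name"}
    ok, result = KnowledgebaseService.create_with_name(
        name=payload["name"], tenant_id=tenant_id, **options)
    if not ok:
        raise ValueError(result)
    KnowledgebaseService.save(**result)
    return dict(result)
```

Both handlers end up writing the same kind of record through the same service; only what they accept and what they hand back differs.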

Key Differences

| Feature | kb/create | /datasets |
|---|---|---|
| Route | /v1/kb/create kb_app.py:57 | /api/v1/datasets dataset.py:56 |
| Auth Method | @login_required kb_app.py:58 | @token_required dataset.py:57 |
| Param Validation | @validate_request("name") kb_app.py:59 | validate_and_parse_json_request(request, CreateDatasetReq) dataset.py:120 |
| Response Format | Returns {"kb_id": id} kb_app.py:76 | Returns full dataset object processed by remap_dictionary_keys() dataset.py:169 |
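
To make the route and auth differences in the comparison above concrete, here is a small sketch that builds (but does not send) one request per endpoint. The two routes come from the comparison; the host, the token values, the "description" option, and the exact Authorization header formats (a raw session token for the legacy route, a Bearer API key for the SDK route) are assumptions for illustration and should be checked against the deployed server.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9380"  # placeholder host, not taken from the source


def build_legacy_request(session_token, name):
    """Request for /v1/kb/create, which sits behind @login_required."""
    body = json.dumps({"name": name}).encode("utf-8")
    return Request(
        BASE_URL + "/v1/kb/create",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header format for a logged-in session.
            "Authorization": session_token,
        },
    )


def build_sdk_request(api_key, name, **options):
    """Request for /api/v1/datasets, which sits behind @token_required."""
    body = json.dumps({"name": name, **options}).encode("utf-8")
    return Request(
        BASE_URL + "/api/v1/datasets",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header format for an API key.
            "Authorization": "Bearer " + api_key,
        },
    )


legacy = build_legacy_request("session-token-placeholder", "docs")
sdk = build_sdk_request("api-key-placeholder", "docs", description="demo")
print(legacy.get_method(), legacy.full_url)
print(sdk.get_method(), sdk.full_url)
```

Passing either object to urllib.request.urlopen would send it; the legacy response would then carry only the new id, while the SDK response would carry the full dataset object.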

Frontend Usage

The frontend uses these two endpoints in different scenarios:

  1. Legacy pages use kb/create.
  2. SDK/API calls use /datasets.

Unified Underlying Service

Both endpoints rely on the KnowledgebaseService.create_with_name() method, which is responsible for creating the dataset configuration that is then persisted to the database.

Conclusion

Although the two endpoints differ in interface design and usage scenarios, they are functionally equivalent: each is an entry point for creating the same kind of dataset. /datasets is the more modern SDK interface, with stricter parameter validation and a fuller response, while kb/create is the legacy web interface used mainly for frontend page interaction.

Notes

  • Both endpoints use the same database model and service layer, ensuring data consistency
  • The frontend selects the appropriate endpoint based on the scenario, but the dataset object created is ultimately the same
  • The /datasets endpoint accepts more configuration parameters, including advanced features such as automatic metadata extraction dataset.py:123-140
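
The gap between the two validation styles can be illustrated with a short sketch. It contrasts a presence-only check, in the spirit of @validate_request("name"), with a schema-style parse, in the spirit of validate_and_parse_json_request(request, CreateDatasetReq). The names require_fields, CreateDatasetSketch, and parse_create_request are hypothetical, and the "description" field is assumed for the example; the real CreateDatasetReq schema is not reproduced here.

```python
from dataclasses import dataclass


def require_fields(payload, *names):
    """Presence-only check: verify that required keys exist, nothing more."""
    missing = [n for n in names if n not in payload]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return payload


@dataclass
class CreateDatasetSketch:
    # Hypothetical fields chosen for illustration only.
    name: str
    description: str = ""

    def __post_init__(self):
        if not isinstance(self.name, str) or not self.name.strip():
            raise ValueError("name must be a non-empty string")
        if not isinstance(self.description, str):
            raise ValueError("description must be a string")
        self.name = self.name.strip()


def parse_create_request(payload):
    """Schema-style check: reject unknown keys, validate types, apply defaults."""
    require_fields(payload, "name")
    unknown = set(payload) - {"name", "description"}
    if unknown:
        raise ValueError("unknown fields: " + ", ".join(sorted(unknown)))
    return CreateDatasetSketch(**payload)
```

A payload such as {"name": 123} passes the presence-only check but is rejected by the schema-style parse, which is the practical meaning of "more complete parameter validation".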

Open Questions

  1. current_user refers to the currently logged-in user. If multiple users are logged in at the same time, how is it guaranteed that current_user is the intended user for a given request?