Code Documentation
==================

The package ``filehooks`` implements the main functionality for the backend of the CodeAbility Sharing Platform.
Since relative imports do not work reliably, this package has to be installed in GitLab.
It can be installed with ``pip`` using the following command.

.. code-block:: bash

   pip3 install .

.. note::
   When installing this package manually, the API token for GitLab, the email username and the email password have to be set in ``conf.production.ini`` before the installation!

filehooks module
----------------

.. automodule:: filehooks
   :members:
   :undoc-members:
   :show-inheritance:

scripts
-------

Collection of all scripts which can be installed in GitLab to extend its functionality.

.. note::
   These scripts assume that the package ``filehooks`` is installed!

.. toctree::
   :maxdepth: 1

   trigger_project_update

Tests
-----

Tests are written using the Python testing framework ``pytest``.

Unit tests
~~~~~~~~~~

Unit tests can be found in ``tests/filehooks`` and can be executed manually with the following command:

.. code-block:: bash

   pytest --cov-report term-missing --cov=filehooks/ tests/filehooks

After each change to the code, make sure that the code coverage is still 100%.
On every push event, the unit tests are executed automatically by GitLab CI/CD.

Integration tests
~~~~~~~~~~~~~~~~~

Since the code depends heavily on the GitLab behavior, which is mocked in the unit tests, integration tests are provided to check that the entire system works as expected.

.. note::
   The current implementation of the integration tests is not very robust: some tests may fail when executed too fast, even though they would pass if given enough time.

The integration tests are not executed automatically by GitLab CI/CD.
They can be executed locally by running ``./run_integration_tests.sh``.

In order to run the integration tests in isolation, they run in a dedicated set of containers.
These are created automatically when the script mentioned above is called, if they do not exist already.
They have the same names as the containers created for production, but with the suffix ``_integration``.
Data used by these containers and by the container running the integration tests is stored in ``/tmp/sharing/integration/`` by default.
This location can be configured in the ``run_integration_tests.sh`` script.

.. note::
   Some tests use the configuration file ``filehooks/conf/conf.test.ini``. Please ensure that the correct values are set before test execution.

.. note::
   The containers use quite a lot of memory, so running the integration tests while the normal set of containers is running could be problematic on systems with too little memory or swap space.

Linter
------

The tools ``pylint`` and ``flake8`` are used for static code analysis.
Experience shows that ``pylint`` is stricter and more verbose.
However, if ``flake8`` finds a potential issue, it is worth checking out.
Some default settings of the tools had to be adjusted.
The ``pylint`` score should always reach 10 points.
If a potential issue is fine the way it is, suppress it.

Automated code checks
---------------------

This project uses git hooks to check the code automatically.
Git hooks are scripts which are executed automatically upon certain git events.
They can be installed on the client side (the developer's machine) and on the server side (the git server which hosts the repository, e.g. a GitLab instance).

Installation
~~~~~~~~~~~~

For development, a client-side hook is used to check code that is about to be committed.
This is done by a pre-commit hook.
Client-side hooks need to be installed by the developer on their machine:

1. Install the Python package ``pre-commit``, e.g. by running ``pip install pre-commit``.
   For other installation methods, see https://pre-commit.com/#installation.
2. Go to the root directory of the project and run ``pre-commit install``.

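
To confirm that the hook from step 2 is in place, one can check for the hook script that ``pre-commit install`` writes.
The following is a minimal sketch, assuming the default hooks location ``.git/hooks`` (a custom ``core.hooksPath`` or a git worktree would need different handling).
The function name ``pre_commit_hook_installed`` is made up for illustration and is not part of this project.

.. code-block:: python

   from pathlib import Path


   def pre_commit_hook_installed(repo_root: str = ".") -> bool:
       """Return True if a pre-commit hook script exists in the default location."""
       # Assumption: hooks live in <repo_root>/.git/hooks (the git default).
       hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
       return hook.is_file()


   if __name__ == "__main__":
       # Run from the root directory of the project.
       print("pre-commit hook installed:", pre_commit_hook_installed())

The check only looks for the file; it does not verify that the file was written by ``pre-commit``.
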

Working with code checks in place
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once the two installation steps have succeeded, git automatically runs some checks on the code before every commit.
If any check fails, the commit is aborted.
Checks which auto-format files fail whenever they change a file.
When this happens, their changes are not put into the git index and need to be added manually, e.g. by running ``git add`` on the affected files, in order to include them in the commit.

Some IDEs which allow committing from within them might not do a good job at displaying the error messages.
Running ``git commit`` from a terminal should give much better feedback, including colored messages and information about which check is running at the moment.

Checks in use
~~~~~~~~~~~~~

The pre-commit hook runs several checks.
The detailed configuration is found in the pre-commit configuration file ``.pre-commit-config.yaml``.
These checks are run:

- some generic hooks (pre-commit defaults)
- ``isort`` sorts import statements (changes are not added to the git index automatically)
- ``black`` formats code (changes are not added to the git index automatically)
- ``mypy`` checks type annotations
- ``pyright`` checks type annotations
- ``flake8`` linter
- ``pylint`` linter
- ``pytest`` runs unit tests

Please keep in mind that the integration tests are not run automatically.
Since it takes several minutes to run them, it would not be reasonable to include them here.

Disabling checks
~~~~~~~~~~~~~~~~

Sometimes it might be necessary to temporarily disable one of the checks.
One way to do this is to force the commit, which simply skips all checks.
However, this is discouraged, since most of the checks will probably still be useful.
For example, when pylint finds an issue in the code which you intend to fix in a later commit, it still makes sense to auto-format the code.
In such cases the ``SKIP`` environment variable can be used: ``SKIP=pylint,flake8 git commit`` sets this variable and initiates a commit.
The variable takes a comma-separated list of checks to skip.
The names of the checks can be found in the configuration file for pre-commit, ``.pre-commit-config.yaml``.
Look there for the values of the ``id`` keys.

GitLab CI
~~~~~~~~~

Since the git-hook based checks need to be installed by each developer on their machine, it could happen that someone forgets to use them.
Therefore, GitLab continuous integration jobs run most of the same checks in our GitLab instance.
This requires a GitLab runner to be installed.
The configuration is found in ``.gitlab-ci.yml``.

Each push to GitLab starts a pipeline which runs the configured tools.
Linter failures are indicated as warnings, unit test failures as errors which abort the pipeline.
It is possible to view the command line output of each tool in GitLab, and in some cases artifacts can be downloaded.

Commandline Interface
---------------------

A command-line interface for Elasticsearch index manipulation is provided. It supports the following operations:

- List indexes (``list-index``)
- Create index (``create-index``)
- Delete index (``delete-index``)
- Delete unused metadata indices (``delete-unused-indices``)
- List aliases (``list-alias``)
- Switch aliases (``switch-alias``)
- Reindex (``reindex``)

Workflow
~~~~~~~~

1. Start a shell in the ``sharing_gitlab`` docker container

   .. code-block:: bash

      docker exec -it sharing_gitlab /bin/bash

2. Navigate to ``utils`` (e.g., ``/file-hooks-src/utils``)
3. Run ``python3 cli.py`` with the appropriate arguments

.. note::
   Running ``python3 cli.py`` with no arguments yields the help output.

Commands
~~~~~~~~

``list-index``
^^^^^^^^^^^^^^

- *Functionality:* Lists all Elasticsearch indexes
- *Usage:*

  .. code-block:: shell

     python3 cli.py list-index

``create-index``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- *Functionality:* Creates a new index for metadata information
- *Usage:*

  .. code-block:: shell

     python3 cli.py create-index \
         -idx-metadata IDX_METADATA

- *Arguments:*

  - IDX_METADATA: Name of the new index for metadata information

``delete-index``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- *Functionality:* Deletes an Elasticsearch index
- *Usage:*

  .. code-block:: shell

     python3 cli.py delete-index IDX

- *Arguments:*

  - IDX: Name of the index to be deleted

``delete-unused-indices``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- *Functionality:* Deletes currently unused metadata indices in Elasticsearch
- *Usage:*

  .. code-block:: shell

     python3 cli.py delete-unused-indices

``list-alias``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- *Functionality:* Lists all Elasticsearch aliases
- *Usage:*

  .. code-block:: shell

     python3 cli.py list-alias

``switch-alias``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- *Functionality:* Removes the alias from the old metadata index and adds it to the new metadata index
- *Usage:*

  .. code-block:: shell

     python3 cli.py switch-alias \
         --old-idx-metadata OLD_IDX_METADATA \
         --new-idx-metadata NEW_IDX_METADATA

- *Arguments:*

  - OLD_IDX_METADATA: Name of the old index for metadata information
  - NEW_IDX_METADATA: Name of the new index for metadata information

.. note::
   - Running ``python3 cli.py -h`` yields the help output.
   - Passing ``-h`` together with a specific command yields the help for that command.

``reindex``
^^^^^^^^^^^

- *Functionality:* Creates a new index for metadata, fills it and switches the alias.
- *Usage:*

  .. code-block:: shell

     python3 cli.py reindex

Example
~~~~~~~

Assuming you have executed steps 1 and 2 from the workflow, reindexing might look like this:

.. code-block::

   $ python3 cli.py list-index
   list-index
   health status index          uuid                   pri rep docs.count docs.deleted store.size pri.store.size
   yellow open   idx_metadata_0 KGQ0aWacREKM44aEYLWkNA   1   1          4            3     41.1kb         41.1kb

   $ python3 cli.py list-alias
   list-alias
   alias    index          filter routing.index routing.search is_write_index
   metadata idx_metadata_0 -      -             -              true

   $ python3 cli.py create-index idx_metadata_1

   $ python3 cli.py list-index
   list-index
   health status index          uuid                   pri rep docs.count docs.deleted store.size pri.store.size
   yellow open   idx_metadata_1 LaA64koiRESp4eF-Of1GTA   1   1          4            0     27.9kb         27.9kb
   yellow open   idx_metadata_0 KGQ0aWacREKM44aEYLWkNA   1   1          4            3     41.1kb         41.1kb

   $ python3 cli.py switch-alias --old-idx-metadata idx_metadata_0 --new-idx-metadata idx_metadata_1
   switch-alias
   You are about to switch the following alias:
   'idx_metadata_0 -> 'idx_metadata_1'
   Would you like to continue? [Y/n] Y
   The aliases were switched!

   $ python3 cli.py list-alias
   list-alias
   alias    index          filter routing.index routing.search is_write_index
   metadata idx_metadata_1 -      -             -              true

   $ python3 cli.py delete-index idx_metadata_0
   delete-index
   You are about to delete the indexes in the list ['idx_metadata_0'].
   Would you like to continue? [Y/n] Y
   The indexes in the list ['idx_metadata_0'] were deleted!

.. warning::
   If users add content between the index creation and the switching of the alias, this content might not be indexed until it is changed again!
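
The alias switch in the example can be pictured as a single request to the Elasticsearch ``_aliases`` endpoint, which applies a list of actions atomically.
The following is a minimal sketch of how such a request body could be built; it is not taken from ``cli.py``, whose actual implementation may differ.
The function name ``build_alias_switch`` is hypothetical, and the alias name ``metadata`` is taken from the ``list-alias`` output in the example.

.. code-block:: python

   import json


   def build_alias_switch(alias: str, old_index: str, new_index: str) -> dict:
       """Build a request body for a POST to the Elasticsearch ``_aliases`` endpoint."""
       # Both actions are sent in one request, so there is no moment at which
       # the alias points at no index or at both indexes.
       return {
           "actions": [
               {"remove": {"index": old_index, "alias": alias}},
               {"add": {"index": new_index, "alias": alias}},
           ]
       }


   if __name__ == "__main__":
       # Illustrative values, matching the example session.
       body = build_alias_switch("metadata", "idx_metadata_0", "idx_metadata_1")
       print(json.dumps(body, indent=2))

Switching the alias in one request does not close the gap described in the warning: content added to the old index after the new index was filled is still missing from the new index.
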