main

Entry point of the ReadNext command line tool

Imports

The command line interface uses typer, a library for building command line interfaces. We also use arxiv to query the ArXiv search service and display the titles of the articles whose IDs are proposed by the system.

The remaining imports are the internal modules of the project that implement the different commands of the CLI.

import arxiv
import chromadb
import os
import typer
from dotenv import load_dotenv
from readnext import __version__
from readnext.arxiv_categories import exists, main, sub
from readnext.arxiv_sync import sync_arxiv
from readnext.embedding import embed_category_papers, download_embedding_model, embedding_system
from readnext.personalize import get_personalized_papers, save_personalized_papers_in_zotero
from rich import print
from typing_extensions import Annotated

Command line interface

version

The version command displays the current installed version of ReadNext.

version

 version ()

Get the current installed version of ReadNext

You can get the version number of the ReadNext instance installed on your machine by running:

readnext version

Configuration

Display the current configuration of ReadNext.

config

 config ()

Get the current configuration of ReadNext

You can display the current configuration options picked up by ReadNext by running:

readnext config

arxiv-top-categories

The arxiv-top-categories command displays the complete list of ArXiv top categories. Note that the categories’ keys are case sensitive.

arxiv_top_categories

 arxiv_top_categories ()

Display ArXiv main categories. Keys are case sensitive.

You can get the list of all the top categories by using this command line:

readnext arxiv-top-categories

arxiv-sub-categories

The arxiv-sub-categories command displays the complete list of ArXiv sub categories. Note that the categories’ keys are case sensitive.

arxiv_sub_categories

 arxiv_sub_categories ()

Display ArXiv sub categories. Keys are case sensitive.

You can get the list of all the sub categories by using this command line:

readnext arxiv-sub-categories
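
Because category keys are case sensitive, a lookup has to match the key exactly. The following is a minimal sketch of such a check, not the project's implementation: the two dictionaries are short hypothetical excerpts, and category_exists is an illustrative name rather than the exists function imported from readnext.arxiv_categories.

```python
# Hypothetical excerpts of the category tables, for illustration only.
# The real tables live in readnext.arxiv_categories and are much longer.
MAIN_CATEGORIES = {
    "cs": "Computer Science",
    "math": "Mathematics",
    "stat": "Statistics",
}
SUB_CATEGORIES = {
    "cs.AI": "Artificial Intelligence",
    "cs.CL": "Computation and Language",
    "stat.ML": "Machine Learning",
}


def category_exists(key: str) -> bool:
    """Return True only when key matches a known category exactly."""
    # Dictionary membership is case sensitive, so "cs.ai" does not match "cs.AI".
    return key in MAIN_CATEGORIES or key in SUB_CATEGORIES


print(category_exists("cs.AI"))  # True
print(category_exists("cs.ai"))  # False
```

A key such as cs.ai is rejected here, which is why the category passed on the command line should be copied exactly as the listing commands display it.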

personalized-papers

The personalized-papers command gives a list of personalized papers based on the user’s current research focus. The command has two required parameters and three optional ones:

category [required] : the ArXiv category to use to query the ArXiv search service. It can be a top or sub category, case sensitive.
focus_collection [required] : the name of the Zotero collection where all the user’s papers of interest are available for ReadNext.
proposals_collection [default: “”] : the name of the Zotero collection where the papers proposed by ReadNext will be added.
with_artifacts [default: False] : if set to True, the artifacts related to the proposed papers (PDF & summary files) will be added to Zotero.
nb_proposals [default: 10] : the number of papers that will be proposed by ReadNext.

To get new paper proposals, run the personalized-papers command. It requires two arguments:

category [required] : the ArXiv top or sub category from which you want to get new paper proposals
focus_collection [required] : the name of the Zotero collection where your papers of interest are stored in Zotero. This is what we refer to as the “Focus” collection above. The name of the collection is case sensitive and should be exactly as written in Zotero.

Then you also have three options available:

--proposals-collection [default: “”] : which tells ReadNext that you want to save the proposed papers in Zotero, in the Zotero Collection specified by the argument. If you don’t use this option, ReadNext will only print the proposed papers in the terminal, but will not save them in Zotero. The default behaviour is that you don’t save them in Zotero.
--with-artifacts / -a [default: False] : which tells ReadNext that you want to save the artifacts (PDF file of the papers and their summarization) into Zotero. This is the recommended workflow, but it requires a lot more space in your Zotero account. If you want to do this, you will most likely need to subscribe to one of their paid options.
--nb-proposals [default: 10] : which tells ReadNext how many papers you want to be proposed.

The following command will propose 3 papers from the cs.AI category, based on the Readnext-Focus-LLM collection in my Zotero library, and save them in the Readnext-Propositions-LLM Zotero collection with all related artifacts:

readnext personalized-papers cs.AI Readnext-Focus-LLM --proposals-collection=Readnext-Propositions-LLM --with-artifacts --nb-proposals=3

As you can see, you can easily create a series of topics you want paper proposals around, where each topic is defined by a set of specific papers that you have read and found important for your research.

personalized_papers

 personalized_papers (category:str, focus_collection:str,
                      proposals_collection:typing.Annotated[str,<typer.models.OptionInfo object at 0x7f14ec87f910>]='',
                      with_artifacts:typing.Annotated[bool,<typer.models.OptionInfo object at 0x7f14ec87f6d0>]=False,
                      nb_proposals=10)

Get personalized papers of a focus-collection from an ArXiv category. If the category is all, then all categories that have been locally synced will be used. If --proposals-collection is set, then the papers will be uploaded to that Zotero collection; otherwise they will only be displayed in the terminal.
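
The two decisions described above (expanding all, and choosing between Zotero and the terminal) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: resolve_categories and output_target are hypothetical helper names, the list of synced categories is passed in explicitly, and the value all is assumed to be matched exactly.

```python
from typing import List


def resolve_categories(category: str, synced_categories: List[str]) -> List[str]:
    """Expand the special value 'all' to every locally synced category."""
    if category == "all":  # assumption: matched exactly, like the category keys
        return list(synced_categories)
    return [category]


def output_target(proposals_collection: str = "") -> str:
    """Describe where proposals go: a Zotero collection if one is named, else the terminal."""
    if proposals_collection:
        return f"zotero:{proposals_collection}"
    # An empty string is the default and means "print only".
    return "terminal"


print(resolve_categories("all", ["cs.AI", "cs.CL"]))
print(output_target("Readnext-Propositions-LLM"))
```

Using the empty string as the "do not save" sentinel mirrors the documented default of proposals_collection.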

Initialize

Before running the command line application, we have to make sure that the tool is properly initialized. The current initialization steps that are required are:

Load environment variables
Make sure that all the configuration options are properly set as environment variables.
Check that all the required local models artifacts are available on the local file system. If not, download them from their source.
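
The last two steps above can be sketched as a single check that reports what is missing. This is a minimal illustration, not the actual init function: check_initialization is a hypothetical name, the environment is passed in as a mapping (it is assumed to have been loaded already, for example by load_dotenv), and the sketch only reports missing model files instead of downloading them.

```python
from pathlib import Path
from typing import Iterable, List, Mapping


def check_initialization(
    env: Mapping[str, str],
    required_vars: Iterable[str],
    model_paths: Iterable[Path],
) -> List[str]:
    """Return a list of problems; an empty list means the tool is ready."""
    problems = []
    # Every required configuration option must be set to a non-empty value.
    for name in required_vars:
        if not env.get(name):
            problems.append(f"missing environment variable: {name}")
    # Every local model artifact must exist on the file system.
    for path in model_paths:
        if not path.exists():
            problems.append(f"missing model artifact: {path}")
    return problems


# "example" is a placeholder value, not a documented setting.
print(check_initialization({"EMBEDDING_SYSTEM": "example"}, ["EMBEDDING_SYSTEM"], []))
```

Returning the problems as a list, rather than stopping at the first one, lets the caller show the user everything that needs fixing in one pass.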

config_check_one_exists

 config_check_one_exists (env_vars:list)

Check if one of the env_vars environment variables exists

config_exists

 config_exists (env_var:str)

Check if env_var environment variable exists
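
Both helpers can be sketched with the standard library alone. The bodies below are an assumption based on the signatures and docstrings shown here, not the project's source. In particular, "exists" is read literally as "is present in the environment", so a variable set to an empty string still counts.

```python
import os
from typing import List


def config_exists(env_var: str) -> bool:
    """Check if the env_var environment variable exists."""
    # Assumption: presence is enough, even if the value is an empty string.
    return env_var in os.environ


def config_check_one_exists(env_vars: List[str]) -> bool:
    """Check if at least one of the env_vars environment variables exists."""
    return any(config_exists(name) for name in env_vars)
```

The second helper suits settings where several alternative variables are acceptable and only one of them has to be provided.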

One thing that needs to be validated at initialization time is the shape of the embeddings in ChromaDB. If the user changed the EMBEDDING_SYSTEM setting from one system to another, the number of dimensions will most likely be different. In that case, Chroma won’t be able to load embeddings with a different dimension. This is why we have to warn the user.

get_embeddings_dimensions

 get_embeddings_dimensions (chroma_client, category:str)

Get the embedding dimensions of the given category

init

 init ()

Initialize the application