CoCitation
The goal is to create a co-citation graph for a list of references.
from co_citation import CoCitation

cites = CoCitation(
    [
        "arxiv:1602.05112",
        "pubmed:8113053",
        "sciencedirect:S0167923610001703",
        "scopus:10.1016/j.cmet.2020.11.014",
    ],
    data_type="journal",  # or "article", "institution"
    wait=None,  # None or the time to wait between requests (in seconds)
    retries=None,  # None or the number of retries for HTTPS requests
    first_last_author=False,  # Set to True to only get the institution of the first and last authors
)
cites.write_graph_edges("graph")
cites.plot_graph(
    display=False,
    k=10,  # The spacing between the nodes
    seed=42,  # Use the seed argument for reproducibility
    margin=dict(b=0, l=110, r=150, t=40),
)
-
class co_citation.CoCitation(articles_list: List[str], sd_api_key: str = '', graph: str = '', node_weights: str = 'eigenvector', wait: Optional[int] = None, retries: Optional[int] = None, data_type: str = 'journal', first_last_author: bool = False)
Create a co-citation graph
-
create_citation_graph(articles_list: List[str]) → networkx.classes.graph.Graph
Get the references of each article and their corresponding data (journal, article or institution).
Generate the co-citation pairs and add them to the graph. The weight of an edge is the number of times its two endpoints are co-cited.
- Parameters
articles_list (list) – The list of article URLs. At the moment only arXiv, ScienceDirect and PubMed are supported
- Returns
The graph
- Return type
nx.Graph
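To make the weighting concrete, here is a minimal sketch of how co-citation pairs can be counted. It is not the library's implementation: the function name `count_co_citations` and the sample journals are hypothetical, and it assumes each item is counted at most once per citing article.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def count_co_citations(reference_lists: List[List[str]]) -> Dict[Tuple[str, str], int]:
    """Count how often two items appear together in the same reference list."""
    weights: Counter = Counter()
    for refs in reference_lists:
        # Assumption: deduplicate per article, and sort so that (a, b) and
        # (b, a) always map to the same key.
        unique_refs = sorted(set(refs))
        for pair in combinations(unique_refs, 2):
            weights[pair] += 1
    return dict(weights)


# Hypothetical journals cited by three articles.
reference_lists = [
    ["Nature", "Science", "Cell"],
    ["Nature", "Science"],
    ["Cell", "Nature"],
]
edges = count_co_citations(reference_lists)
```

Each key of `edges` would become an edge of the graph and each value its weight.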
-
filter_low_co_citations(criteria: int) → None
Remove low-weight edges and isolated nodes
- Parameters
criteria (int) – The minimum edge weight to keep in the resulting graph
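The filtering step can be sketched on a plain dictionary of weighted edges. This is an illustration only: the helper `filter_low_weight_edges` is hypothetical, and treating the threshold as inclusive (`>=`) is an assumption based on the parameter description.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def filter_low_weight_edges(
    edges: Dict[Edge, int], minimum: int
) -> Tuple[Dict[Edge, int], Set[str]]:
    """Keep edges whose weight is at least `minimum` and the nodes they touch."""
    # Assumption: an edge whose weight equals the threshold is kept.
    kept = {pair: weight for pair, weight in edges.items() if weight >= minimum}
    # A node with no remaining edge is isolated, so it is dropped implicitly.
    nodes = {node for pair in kept for node in pair}
    return kept, nodes


# Hypothetical co-citation weights.
edges = {
    ("Cell", "Nature"): 2,
    ("Cell", "Science"): 1,
    ("Nature", "Science"): 2,
    ("Nature", "PLOS ONE"): 1,
}
kept_edges, kept_nodes = filter_low_weight_edges(edges, minimum=2)
```

In this sample, "PLOS ONE" loses its only edge and therefore disappears from the node set.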
-
filter_low_co_citations_nodes(criteria: int) → None
Remove low-weight nodes
- Parameters
criteria (int) – The minimum node weight to keep in the resulting graph
-
static gen_perms(citations: List[str]) → List[List[Union[str, int]]]
Get all pairs of items from a list, treating (a, b) and (b, a) as the same pair
-
get_all_elsevier_refs(api_refs_url, refs: List[str]) → List[str]
Get all references for an article indexed in Scopus. The references are paginated 40 per page, so the function calls itself on each following API page.
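The recursive pagination pattern described here can be sketched as follows. Everything in this block is hypothetical: the page layout (`"refs"` and `"next"` keys), the `fetch_page` callable and the in-memory pages stand in for the real API, whose response format is not shown in this reference.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


def collect_all_refs(
    url: str,
    fetch_page: Callable[[str], Page],
    refs: Optional[List[str]] = None,
) -> List[str]:
    """Accumulate references page by page, recursing while a next page exists."""
    if refs is None:
        refs = []
    page = fetch_page(url)
    refs.extend(page["refs"])
    next_url = page.get("next")
    if next_url is None:
        # No further page: the accumulated list is complete.
        return refs
    return collect_all_refs(next_url, fetch_page, refs)


# Hypothetical in-memory pages used in place of HTTP requests.
fake_pages = {
    "page-1": {"refs": ["ref-1", "ref-2"], "next": "page-2"},
    "page-2": {"refs": ["ref-3"], "next": None},
}
all_refs = collect_all_refs("page-1", fake_pages.__getitem__)
```

Passing the accumulator along keeps every page's results in a single list, which matches the `refs` argument in the signature above.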
-
get_article_institution_pubmed(pmid: str) → List[str]
Get the institutions of an article indexed in semanticscholar
-
get_article_institution_scopus(ref: bs4.element.Tag) → List[str]
Get the institutions of authors from a scopus reference
-
static get_article_title_sem_scholar(ref: dict) → str
Get the title of an article indexed in semanticscholar
-
get_citations(article_url: str) → List[str]
Get all citations data for an article
This function does two things:
1. Get the citations
2. For each citation, get the data (journal or article)
-
get_edge_trace() → List[plotly.graph_objs._scatter.Scatter]
Generate the edge traces. The colors correspond to the edge weights
- Returns
The list of edge traces
- Return type
List[plotly.graph_objs._scatter.Scatter]
-
get_journal_scopus(ref: bs4.element.Tag) → str
Get the journal of a scopus article
- Parameters
ref (Tag) – A scopus reference in a BeautifulSoup Tag
- Returns
The journal’s name
- Return type
str
-
get_node_trace() → dict
Generate the node trace. The colors correspond to the sum of the weights of the edges connected to each node
- Returns
The node trace
- Return type
dict
-
get_scopus_affiliation(aff_id: str) → str
Get the institutions of authors from a scopus reference
-
static init_nodes_weight(graph: networkx.classes.graph.Graph, criteria: str = 'eigenvector') → networkx.classes.graph.Graph
Initialize the node weights
- Parameters
graph (nx.Graph) – The graph
criteria (str) – The criteria for the weights. Must be one of “eigenvector” or “betweenness”
- Returns
The graph with the initialized node weights
- Return type
nx.Graph
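As an illustration of what an "eigenvector" node weight measures, the sketch below computes eigenvector centrality with a plain power iteration on a dictionary of weighted edges. It is a stand-in written for this page, not the library's code: the function `eigenvector_weights` and the sample edges are hypothetical. networkx ships `eigenvector_centrality` and `betweenness_centrality` functions, but whether `init_nodes_weight` calls them is not stated here.

```python
from math import sqrt
from typing import Dict, Tuple


def eigenvector_weights(
    edges: Dict[Tuple[str, str], float],
    iterations: int = 200,
    tol: float = 1e-9,
) -> Dict[str, float]:
    """Approximate eigenvector centrality of an undirected weighted graph."""
    # Build a symmetric weighted adjacency map.
    neighbours: Dict[str, Dict[str, float]] = {}
    for (a, b), weight in edges.items():
        neighbours.setdefault(a, {})[b] = weight
        neighbours.setdefault(b, {})[a] = weight

    scores = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        # Iterate with (A + I) rather than A so that the method also
        # converges on bipartite graphs.
        updated = {
            node: scores[node] + sum(w * scores[other] for other, w in adjacent.items())
            for node, adjacent in neighbours.items()
        }
        norm = sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}
        change = sum(abs(updated[node] - scores[node]) for node in updated)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical graph: a triangle a-b-c with an extra node d attached to c.
sample_edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1, ("c", "d"): 1}
scores = eigenvector_weights(sample_edges)
```

In this sample, `c` receives the largest score because it is linked to every other node, while `d` receives the smallest because its only neighbour is `c`.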
-
static load_abbreviations() → Dict[str, str]
Get journal abbreviations
- Returns
The abbreviations
- Return type
Dict[str, str]
-
plot_graph(display=True, k=20, seed=42, margin={'b': 0, 'l': 5, 'r': 5, 't': 40}) → None
Plot the co-citation graph
-