Count number of authors #13

@nichtich

The number of distinct identified authors can be counted via P50 (author):

$ zcat 20171120/wikidata-20171120-publications.ndjson.gz | \
    jq -r '.claims.P50[]?' | sort -u | wc -l
120821

This takes 7 minutes to run on my machine. Indexing the whole dataset in a database should be faster and more flexible for additional analytics, for instance the total number of identified author statements:

$ zcat 20171120/wikidata-20171120-publications.ndjson.gz | jq -r '.claims.P50[]?' | wc -l
974191
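
Both P50 numbers could also be computed in a single pass over the dump, without sorting. The following is a minimal Python sketch, not a tested replacement for the pipeline. It assumes each line is a JSON object whose `claims.P50` is a list of plain item IDs, which is what the jq filter implies. The function names `count_p50` and `count_p50_in_dump` are made up for illustration.

```python
import gzip
import json


def count_p50(lines):
    """Return (number of P50 statements, number of distinct P50 values)."""
    statements = 0
    authors = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Like `.claims.P50[]?` in jq: a missing property contributes nothing.
        # Assumption: the values are plain strings (item IDs), so they are hashable.
        values = (record.get("claims") or {}).get("P50") or []
        statements += len(values)
        authors.update(values)
    return statements, len(authors)


def count_p50_in_dump(path):
    """Stream a gzipped NDJSON dump through count_p50 without unpacking it to disk."""
    with gzip.open(path, "rt", encoding="utf-8") as dump:
        return count_p50(dump)
```

Calling `count_p50_in_dump("20171120/wikidata-20171120-publications.ndjson.gz")` would return both counts as a pair. The set of distinct authors is kept in memory, which should be unproblematic at the scale reported above.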

The number of unidentified author statements, recorded with P2093 (author name string), can be counted with the same kind of pipeline:

$ zcat 20171120/wikidata-20171120-publications.ndjson.gz | jq -r '.claims.P2093[]?' | wc -l
43206518
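
As a rough illustration of the database idea, the author statements could be loaded into SQLite once and then queried repeatedly. This is only a sketch under assumptions: the table name `author_claim`, its columns, the function names, and the presence of an `id` field on each record are not taken from the dump and may need adjusting. It also assumes claim values are plain strings.

```python
import json
import sqlite3


def author_rows(lines):
    """Yield (publication, property, value) for each P50/P2093 statement."""
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        claims = record.get("claims") or {}
        for prop in ("P50", "P2093"):
            for value in claims.get(prop) or []:
                # Assumption: the item ID is stored under "id"; None if absent.
                yield record.get("id"), prop, value


def index_authors(lines, db):
    """Load author statements from NDJSON lines into an SQLite table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS author_claim ("
        "publication TEXT, property TEXT, value TEXT)"
    )
    # executemany consumes the generator lazily, so rows are streamed.
    db.executemany("INSERT INTO author_claim VALUES (?, ?, ?)", author_rows(lines))
    db.execute(
        "CREATE INDEX IF NOT EXISTS author_claim_idx "
        "ON author_claim (property, value)"
    )
    db.commit()


def author_counts(db):
    """Return {property: (statements, distinct values)}."""
    query = (
        "SELECT property, COUNT(*), COUNT(DISTINCT value) "
        "FROM author_claim GROUP BY property"
    )
    return {prop: (total, distinct) for prop, total, distinct in db.execute(query)}
```

With a file-backed connection (for example `sqlite3.connect("publications.db")`, a hypothetical filename) the load happens once, and further questions such as the most frequent author per property become plain SQL queries rather than another pass over the dump.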
