Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 70227 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 4.3 MiB |
Average record size in memory | 64.0 B |
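The two memory figures above agree with each other if the average record size is simply the total size divided by the number of observations. A minimal check under that assumption (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_observations = 70227
avg_record_bytes = 64.0

# Assumption: total size = observations x average record size.
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 1024**2  # 1 MiB = 1024 * 1024 bytes

print(f"{total_mib:.1f} MiB")
```

The result rounds to the 4.3 MiB shown in the table.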
Variable types
Numeric | 1 |
---|---|
Categorical | 7 |
Alerts
country has a high cardinality: 127 distinct values | High cardinality |
coverage has a high cardinality: 10044 distinct values | High cardinality |
issn has a high cardinality: 37076 distinct values | High cardinality |
publisher has a high cardinality: 11095 distinct values | High cardinality |
title has a high cardinality: 68293 distinct values | High cardinality |
country is highly imbalanced (55.1%) | Imbalance |
coverage is highly imbalanced (53.4%) | Imbalance |
publisher is highly imbalanced (52.0%) | Imbalance |
title is uniformly distributed | Uniform |
sourceid has unique values | Unique |
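The cardinality and uniqueness alerts can be reproduced from the distinct counts reported for each variable. The sketch below is a rough approximation: the threshold of 50 distinct values is an assumption about the profiler's configuration and is not stated in this report, and `cardinality_alert` is a hypothetical helper.

```python
N_OBSERVATIONS = 70227
CARDINALITY_THRESHOLD = 50  # assumed cut-off, not stated in this report

# Distinct counts copied from the per-variable sections of the report.
distinct_counts = {
    "sourceid": 70227,
    "country": 127,
    "coverage": 10044,
    "issn": 37076,
    "publisher": 11095,
    "region": 9,
    "title": 68293,
    "type": 4,
}

def cardinality_alert(distinct, n_rows=N_OBSERVATIONS, threshold=CARDINALITY_THRESHOLD):
    """Return a hypothetical alert label for one variable, or None."""
    if distinct == n_rows:
        return "Unique"
    if distinct > threshold:
        return "High cardinality"
    return None

alerts = {name: cardinality_alert(d) for name, d in distinct_counts.items()}
flagged = {name: label for name, label in alerts.items() if label is not None}
print(flagged)
```

Under this assumed threshold the flagged variables match the alert list: region (9 values) and type (4 values) stay unflagged.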
Reproduction
Analysis started | 2023-05-04 15:12:09.807657 |
---|---|
Analysis finished | 2023-05-04 15:12:13.789986 |
Duration | 3.98 seconds |
Software version | ydata-profiling v4.1.2 |
Download configuration | config.json |
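The reported duration follows directly from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-05-04 15:12:09.807657")
finished = datetime.fromisoformat("2023-05-04 15:12:13.789986")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```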
sourceid
Real number (ℝ)
Distinct | 70227 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.3699065 × 10^10 |
Minimum | 12000 |
---|---|
Maximum | 2.110106 × 10^10 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 548.8 KiB |
Quantile statistics
Minimum | 12000 |
---|---|
5-th percentile | 16616.3 |
Q1 | 98868 |
Median | 1.9900194 × 10^10 |
Q3 | 2.1100457 × 10^10 |
95-th percentile | 2.1100932 × 10^10 |
Maximum | 2.110106 × 10^10 |
Range | 2.1101048 × 10^10 |
Interquartile range (IQR) | 2.1100358 × 10^10 |
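Range and IQR are derived from the other quantile statistics: range is maximum minus minimum, and IQR is Q3 minus Q1. A quick check with the rounded values shown in the table:

```python
# Rounded values copied from the Quantile statistics table.
minimum = 12000
q1 = 98868
q3 = 2.1100457e10
maximum = 2.110106e10

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%

print(f"range = {value_range:.7e}")
print(f"IQR   = {iqr:.7e}")
```

Q1 is tiny relative to Q3 here, so the IQR is almost as large as the full range. This reflects the two widely separated clusters of sourceid values (five-digit IDs and eleven-digit IDs).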
Descriptive statistics
Standard deviation | 9.3402116 × 10^9 |
---|---|
Coefficient of variation (CV) | 0.68181378 |
Kurtosis | -1.472561 |
Mean | 1.3699065 × 10^10 |
Median Absolute Deviation (MAD) | 1.2006955 × 10^9 |
Skewness | -0.64060688 |
Sum | 9.6204427 × 10^14 |
Variance | 8.7239552 × 10^19 |
Monotonicity | Strictly increasing |
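Two of the descriptive statistics are functions of the mean and standard deviation: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A minimal check with the rounded values from the table (small differences in the last digits are expected because the inputs are rounded):

```python
import math

# Rounded values copied from the Descriptive statistics table.
mean = 1.3699065e10
std = 9.3402116e9

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the square of the standard deviation

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.5e}")
```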
Value | Count | Frequency (%) |
12000 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032773 × 10^10 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032772 × 10^10 | 1 | < 0.1% |
2.110032823 × 10^10 | 1 | < 0.1% |
Other values (70217) | 70217 |
Value | Count | Frequency (%) |
12000 | 1 | |
12001 | 1 | |
12002 | 1 | |
12004 | 1 | |
12005 | 1 | |
12006 | 1 | |
12007 | 1 | |
12008 | 1 | |
12009 | 1 | |
12010 | 1 |
Value | Count | Frequency (%) |
2.110105979 × 10^10 | 1 | |
2.110105978 × 10^10 | 1 | |
2.110105978 × 10^10 | 1 | |
2.110105949 × 10^10 | 1 | |
2.11010593 × 10^10 | 1 | |
2.11010593 × 10^10 | 1 | |
2.110105901 × 10^10 | 1 | |
2.110105901 × 10^10 | 1 | |
2.110105897 × 10^10 | 1 | |
2.110105896 × 10^10 | 1 |
country
Categorical
HIGH CARDINALITY
IMBALANCE
Distinct | 127 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
United States | 37138 |
---|---|
United Kingdom | 8812 |
Netherlands | 3481 |
Germany | 2712 |
Switzerland | 1167 |
Other values (122) | 16917 |
Length
Max length | 22 |
---|---|
Median length | 13 |
Mean length | 11.377476 |
Min length | 4 |
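The mean length can be cross-checked against the total character count reported for this variable, assuming it is the total number of characters divided by the number of observations:

```python
# Total characters for country and the dataset row count, as reported.
total_characters = 799006
n_observations = 70227

mean_length = total_characters / n_observations
print(f"{mean_length:.6f}")
```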
Characters and Unicode
Total characters | 799006 |
---|---|
Distinct characters | 50 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 18 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | United States |
---|---|
2nd row | United States |
3rd row | United States |
4th row | United States |
5th row | United States |
Common Values
Value | Count | Frequency (%) |
United States | 37138 | 52.9% |
United Kingdom | 8812 | 12.5% |
Netherlands | 3481 | 5.0% |
Germany | 2712 | 3.9% |
Switzerland | 1167 | 1.7% |
China | 1150 | 1.6% |
France | 1136 | 1.6% |
Italy | 1058 | 1.5% |
Spain | 967 | 1.4% |
Japan | 862 | 1.2% |
Other values (117) | 11744 | 16.7% |
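Each frequency in the table is the count divided by the 70227 observations, shown to one decimal place. The sketch below reproduces that formatting; `format_frequency` is a hypothetical helper, and the "< 0.1%" cut-off is inferred from how small values appear elsewhere in this report.

```python
N_OBSERVATIONS = 70227

def format_frequency(count, total=N_OBSERVATIONS):
    """Format a count as a percentage of all observations."""
    pct = 100 * count / total
    if pct < 0.1:
        return "< 0.1%"  # inferred display rule for very small shares
    return f"{pct:.1f}%"

# Counts copied from the Common Values table for country.
for country, count in [("United Kingdom", 8812), ("Netherlands", 3481), ("Japan", 862)]:
    print(country, format_frequency(count))
```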
Words
Value | Count | Frequency (%) |
united | 46079 | |
states | 37138 | |
kingdom | 8812 | 7.4% |
netherlands | 3481 | 2.9% |
germany | 2712 | 2.3% |
switzerland | 1167 | 1.0% |
china | 1150 | 1.0% |
france | 1136 | 1.0% |
italy | 1058 | 0.9% |
spain | 967 | 0.8% |
Other values (136) | 14679 | 12.4% |
Most occurring characters
Value | Count | Frequency (%) |
t | 129585 | |
e | 101648 | |
n | 73646 | |
i | 66270 | |
a | 65217 | |
d | 63365 | |
(space) | 48152 | 6.0% |
U | 46202 | 5.8% |
s | 43443 | 5.4% |
S | 40698 | 5.1% |
Other values (40) | 120780 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 632496 | |
Uppercase Letter | 118358 | 14.8% |
Space Separator | 48152 | 6.0% |
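The category names in this table correspond to Unicode general categories (`Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator). A minimal sketch of how such counts could be computed with Python's standard `unicodedata` module follows. It illustrates the idea and is not necessarily how the profiler computes them; `category_counts` is a hypothetical helper.

```python
from collections import Counter
import unicodedata

def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# Two values taken from the country column.
sample = ["United States", "Germany"]
print(category_counts(sample))
```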
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
t | 129585 | |
e | 101648 | |
n | 73646 | |
i | 66270 | |
a | 65217 | |
d | 63365 | |
s | 43443 | 6.9% |
r | 14470 | 2.3% |
o | 13760 | 2.2% |
m | 12645 | 2.0% |
Other values (16) | 48447 | 7.7% |
Uppercase Letter
Value | Count | Frequency (%) |
U | 46202 | |
S | 40698 | |
K | 9274 | 7.8% |
N | 3787 | 3.2% |
C | 2931 | 2.5% |
G | 2880 | 2.4% |
I | 2386 | 2.0% |
F | 1932 | 1.6% |
P | 1474 | 1.2% |
R | 1350 | 1.1% |
Other values (13) | 5444 | 4.6% |
Space Separator
Value | Count | Frequency (%) |
(space) | 48152 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 750854 | |
Common | 48152 | 6.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
t | 129585 | |
e | 101648 | |
n | 73646 | |
i | 66270 | |
a | 65217 | |
d | 63365 | |
U | 46202 | 6.2% |
s | 43443 | 5.8% |
S | 40698 | 5.4% |
r | 14470 | 1.9% |
Other values (39) | 106310 |
Common
Value | Count | Frequency (%) |
(space) | 48152 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 799006 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
t | 129585 | |
e | 101648 | |
n | 73646 | |
i | 66270 | |
a | 65217 | |
d | 63365 | |
(space) | 48152 | 6.0% |
U | 46202 | 5.8% |
s | 43443 | 5.4% |
S | 40698 | 5.1% |
Other values (40) | 120780 |
coverage
Categorical
HIGH CARDINALITY
IMBALANCE
Distinct | 10044 |
---|---|
Distinct (%) | 14.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
None | 33355 |
---|---|
2008-2021 | 789 |
2009-2021 | 767 |
2010-2021 | 750 |
2018-2021 | 739 |
Other values (10039) | 33827 |
Length
Max length | 303 |
---|---|
Median length | 213 |
Mean length | 9.1563786 |
Min length | 4 |
Characters and Unicode
Total characters | 643025 |
---|---|
Distinct characters | 17 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 8447 |
---|---|
Unique (%) | 12.0% |
Sample
1st row | 1999-2003, 2005, 2008 |
---|---|
2nd row | 1958-2021 |
3rd row | 1969-2021 |
4th row | 2000-2021 |
5th row | 1988-2021 |
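The coverage values are stored as free-text lists of years and year ranges, which explains the high cardinality. If they are needed for analysis, they could be parsed into numeric spans. The sketch below assumes the format seen in the sample rows (comma-separated items, each either `YYYY` or `YYYY-YYYY`) and treats the literal string `None` as no coverage; `parse_coverage` is a hypothetical helper.

```python
def parse_coverage(text):
    """Parse a coverage string into a list of (start_year, end_year) tuples."""
    text = text.strip()
    if text == "None":  # placeholder string used for rows without coverage
        return []
    spans = []
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-", 1)
            spans.append((int(start), int(end)))
        else:
            # A single year is treated as a one-year span.
            spans.append((int(part), int(part)))
    return spans

print(parse_coverage("1999-2003, 2005, 2008"))
```

Values that do not follow the assumed format would raise a `ValueError`, which is a useful signal when cleaning the column.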
Common Values
Value | Count | Frequency (%) |
None | 33355 | 47.5% |
2008-2021 | 789 | 1.1% |
2009-2021 | 767 | 1.1% |
2010-2021 | 750 | 1.1% |
2018-2021 | 739 | 1.1% |
2019-2021 | 703 | 1.0% |
1996-2021 | 694 | 1.0% |
2011-2021 | 689 | 1.0% |
2020-2021 | 657 | 0.9% |
2017-2021 | 640 | 0.9% |
Other values (10034) | 30444 | 43.4% |
Words
Value | Count | Frequency (%) |
none | 33355 | |
1996-2021 | 1049 | 1.1% |
2008-2021 | 927 | 1.0% |
2009-2021 | 912 | 1.0% |
2010-2021 | 863 | 0.9% |
2020-2021 | 850 | 0.9% |
2018-2021 | 847 | 0.9% |
2019-2021 | 813 | 0.9% |
2011-2021 | 811 | 0.9% |
2017-2021 | 778 | 0.8% |
Other values (2621) | 50924 |
Most occurring characters
Value | Count | Frequency (%) |
2 | 102463 | |
0 | 96451 | |
1 | 89218 | |
9 | 62377 | |
- | 46141 | |
o | 33355 | 5.2% |
N | 33355 | 5.2% |
e | 33355 | 5.2% |
n | 33355 | 5.2% |
, | 21902 | 3.4% |
Other values (7) | 91053 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 419660 | |
Lowercase Letter | 100065 | 15.6% |
Dash Punctuation | 46141 | 7.2% |
Uppercase Letter | 33355 | 5.2% |
Other Punctuation | 21902 | 3.4% |
Space Separator | 21902 | 3.4% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
2 | 102463 | |
0 | 96451 | |
1 | 89218 | |
9 | 62377 | |
8 | 18387 | 4.4% |
7 | 14777 | 3.5% |
6 | 11980 | 2.9% |
5 | 8512 | 2.0% |
4 | 8154 | 1.9% |
3 | 7341 | 1.7% |
Lowercase Letter
Value | Count | Frequency (%) |
o | 33355 | |
e | 33355 | |
n | 33355 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 46141 |
Uppercase Letter
Value | Count | Frequency (%) |
N | 33355 |
Other Punctuation
Value | Count | Frequency (%) |
, | 21902 |
Space Separator
Value | Count | Frequency (%) |
(space) | 21902 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 509605 | |
Latin | 133420 | 20.7% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
2 | 102463 | |
0 | 96451 | |
1 | 89218 | |
9 | 62377 | |
- | 46141 | |
, | 21902 | 4.3% |
(space) | 21902 | 4.3% |
8 | 18387 | 3.6% |
7 | 14777 | 2.9% |
6 | 11980 | 2.4% |
Other values (3) | 24007 | 4.7% |
Latin
Value | Count | Frequency (%) |
o | 33355 | |
N | 33355 | |
e | 33355 | |
n | 33355 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 643025 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
2 | 102463 | |
0 | 96451 | |
1 | 89218 | |
9 | 62377 | |
- | 46141 | |
o | 33355 | 5.2% |
N | 33355 | 5.2% |
e | 33355 | 5.2% |
n | 33355 | 5.2% |
, | 21902 | 3.4% |
Other values (7) | 91053 |
issn
Categorical
Distinct | 37076 |
---|---|
Distinct (%) | 52.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
- | 33072 |
---|---|
10716947 | 9 |
15417719 | 7 |
09353224 | 3 |
1038412X | 3 |
Other values (37071) | 37133 |
Length
Max length | 38 |
---|---|
Median length | 28 |
Mean length | 7.2422573 |
Min length | 1 |
Characters and Unicode
Total characters | 508602 |
---|---|
Distinct characters | 14 |
Distinct categories | 5 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 37009 |
---|---|
Unique (%) | 52.7% |
Sample
1st row | 15276228 |
---|---|
2nd row | 00225002, 19383711 |
3rd row | 00225061, 15206696 |
4th row | 15299740, 15299732 |
5th row | 15736598, 08949867 |
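The issn column holds eight-character ISSNs without the usual hyphen, sometimes two per cell, with `-` as a placeholder when none is recorded. A sketch of splitting a cell and validating each ISSN with the standard check-digit rule (weights 8 down to 2 on the first seven digits, modulo 11, with `X` standing for 10) follows; `split_issns` and `issn_is_valid` are hypothetical helper names.

```python
def split_issns(cell):
    """Split a cell such as '00225002, 19383711' into individual ISSNs."""
    cell = cell.strip()
    if cell == "-":  # placeholder used when no ISSN is recorded
        return []
    return [part.strip() for part in cell.split(",")]

def issn_is_valid(issn):
    """Check the ISSN check digit (the last of the eight characters)."""
    issn = issn.replace("-", "").upper()
    if len(issn) != 8 or not issn[:7].isdigit():
        return False
    total = sum(int(digit) * weight for digit, weight in zip(issn[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return issn[7] == expected

for issn in split_issns("00225002, 19383711"):
    print(issn, issn_is_valid(issn))
```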
Common Values
Value | Count | Frequency (%) |
- | 33072 | 47.1% |
10716947 | 9 | < 0.1% |
15417719 | 7 | < 0.1% |
09353224 | 3 | < 0.1% |
1038412X | 3 | < 0.1% |
10503862 | 2 | < 0.1% |
15938883, 11296569 | 2 | < 0.1% |
10672478 | 2 | < 0.1% |
07347464 | 2 | < 0.1% |
10928138 | 2 | < 0.1% |
Other values (37066) | 37123 | 52.9% |
Words
Value | Count | Frequency (%) |
(empty) | 33072 | |
10716947 | 9 | < 0.1% |
15417719 | 7 | < 0.1% |
16608151 | 3 | < 0.1% |
13474065 | 3 | < 0.1% |
00214922 | 3 | < 0.1% |
1038412x | 3 | < 0.1% |
09353224 | 3 | < 0.1% |
0148396x | 2 | < 0.1% |
14320681 | 2 | < 0.1% |
Other values (54838) | 54949 |
Most occurring characters
Value | Count | Frequency (%) |
1 | 63861 | |
0 | 59077 | |
2 | 48805 | |
5 | 40128 | |
3 | 39654 | |
7 | 38534 | |
4 | 38129 | |
9 | 36273 | |
6 | 35512 | |
8 | 34913 | |
Other values (4) | 73716 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 434886 | |
Dash Punctuation | 33072 | 6.5% |
Other Punctuation | 17829 | 3.5% |
Space Separator | 17829 | 3.5% |
Uppercase Letter | 4986 | 1.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 63861 | |
0 | 59077 | |
2 | 48805 | |
5 | 40128 | |
3 | 39654 | |
7 | 38534 | |
4 | 38129 | |
9 | 36273 | |
6 | 35512 | |
8 | 34913 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 33072 |
Other Punctuation
Value | Count | Frequency (%) |
, | 17829 |
Space Separator
Value | Count | Frequency (%) |
(space) | 17829 |
Uppercase Letter
Value | Count | Frequency (%) |
X | 4986 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 503616 | |
Latin | 4986 | 1.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 63861 | |
0 | 59077 | |
2 | 48805 | |
5 | 40128 | |
3 | 39654 | |
7 | 38534 | |
4 | 38129 | |
9 | 36273 | |
6 | 35512 | |
8 | 34913 | |
Other values (3) | 68730 |
Latin
Value | Count | Frequency (%) |
X | 4986 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 508602 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 63861 | |
0 | 59077 | |
2 | 48805 | |
5 | 40128 | |
3 | 39654 | |
7 | 38534 | |
4 | 38129 | |
9 | 36273 | |
6 | 35512 | |
8 | 34913 | |
Other values (4) | 73716 |
publisher
Categorical
HIGH CARDINALITY
IMBALANCE
Distinct | 11095 |
---|---|
Distinct (%) | 15.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
None | 33944 |
---|---|
Taylor and Francis Ltd. | 1483 |
Elsevier BV | 771 |
Routledge | 731 |
Elsevier | 638 |
Other values (11090) | 32660 |
Length
Max length | 158 |
---|---|
Median length | 144 |
Mean length | 15.862574 |
Min length | 3 |
Characters and Unicode
Total characters | 1113981 |
---|---|
Distinct characters | 77 |
Distinct categories | 10 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 8493 |
---|---|
Unique (%) | 12.1% |
Sample
1st row | Columbus State University |
---|---|
2nd row | Wiley-Blackwell |
3rd row | John Wiley & Sons Inc. |
4th row | Routledge |
5th row | Wiley-Blackwell |
Common Values
Value | Count | Frequency (%) |
None | 33944 | 48.3% |
Taylor and Francis Ltd. | 1483 | 2.1% |
Elsevier BV | 771 | 1.1% |
Routledge | 731 | 1.0% |
Elsevier | 638 | 0.9% |
Wiley-Blackwell Publishing Ltd | 638 | 0.9% |
SAGE Publications Inc. | 516 | 0.7% |
Springer Verlag | 511 | 0.7% |
Elsevier Ltd. | 473 | 0.7% |
Emerald Group Publishing Ltd. | 447 | 0.6% |
Other values (11085) | 30075 | 42.8% |
Words
Value | Count | Frequency (%) |
none | 33944 | 20.6% |
ltd | 6130 | 3.7% |
of | 5689 | 3.5% |
and | 4038 | 2.4% |
university | 3672 | 2.2% |
publishing | 3496 | 2.1% |
inc | 2902 | 1.8% |
press | 2755 | 1.7% |
elsevier | 2522 | 1.5% |
de | 2212 | 1.3% |
Other values (10581) | 97505 |
Most occurring characters
Value | Count | Frequency (%) |
e | 115032 | 10.3% |
n | 97400 | 8.7% |
(space) | 94648 | 8.5% |
o | 82933 | 7.4% |
i | 82401 | 7.4% |
a | 59831 | 5.4% |
r | 53928 | 4.8% |
s | 50889 | 4.6% |
t | 48634 | 4.4% |
l | 43017 | 3.9% |
Other values (67) | 385268 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 836450 | |
Uppercase Letter | 164894 | 14.8% |
Space Separator | 94648 | 8.5% |
Other Punctuation | 13912 | 1.2% |
Dash Punctuation | 2155 | 0.2% |
Open Punctuation | 853 | 0.1% |
Close Punctuation | 849 | 0.1% |
Math Symbol | 132 | < 0.1% |
Decimal Number | 87 | < 0.1% |
Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 115032 | |
n | 97400 | |
o | 82933 | |
i | 82401 | |
a | 59831 | 7.2% |
r | 53928 | 6.4% |
s | 50889 | 6.1% |
t | 48634 | 5.8% |
l | 43017 | 5.1% |
c | 36358 | 4.3% |
Other values (16) | 166027 |
Uppercase Letter
Value | Count | Frequency (%) |
N | 37339 | |
S | 13843 | 8.4% |
P | 13758 | 8.3% |
A | 9717 | 5.9% |
E | 9594 | 5.8% |
I | 9494 | 5.8% |
L | 8824 | 5.4% |
C | 7841 | 4.8% |
M | 6371 | 3.9% |
B | 6000 | 3.6% |
Other values (16) | 42113 |
Other Punctuation
Value | Count | Frequency (%) |
. | 10725 | |
, | 1419 | 10.2% |
; | 537 | 3.9% |
& | 535 | 3.8% |
' | 460 | 3.3% |
/ | 168 | 1.2% |
" | 45 | 0.3% |
: | 22 | 0.2% |
* | 1 | < 0.1% |
Decimal Number
Value | Count | Frequency (%) |
8 | 29 | |
1 | 26 | |
5 | 12 | |
0 | 7 | 8.0% |
3 | 5 | 5.7% |
4 | 5 | 5.7% |
2 | 2 | 2.3% |
9 | 1 | 1.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 852 | |
[ | 1 | 0.1% |
Close Punctuation
Value | Count | Frequency (%) |
) | 848 | |
] | 1 | 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 94648 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2155 |
Math Symbol
Value | Count | Frequency (%) |
+ | 132 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1001344 | |
Common | 112637 | 10.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 115032 | 11.5% |
n | 97400 | 9.7% |
o | 82933 | 8.3% |
i | 82401 | 8.2% |
a | 59831 | 6.0% |
r | 53928 | 5.4% |
s | 50889 | 5.1% |
t | 48634 | 4.9% |
l | 43017 | 4.3% |
N | 37339 | 3.7% |
Other values (42) | 329940 |
Common
Value | Count | Frequency (%) |
(space) | 94648 | |
. | 10725 | 9.5% |
- | 2155 | 1.9% |
, | 1419 | 1.3% |
( | 852 | 0.8% |
) | 848 | 0.8% |
; | 537 | 0.5% |
& | 535 | 0.5% |
' | 460 | 0.4% |
/ | 168 | 0.1% |
Other values (15) | 290 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1113981 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 115032 | 10.3% |
n | 97400 | 8.7% |
(space) | 94648 | 8.5% |
o | 82933 | 7.4% |
i | 82401 | 7.4% |
a | 59831 | 5.4% |
r | 53928 | 4.8% |
s | 50889 | 4.6% |
t | 48634 | 4.4% |
l | 43017 | 3.9% |
Other values (67) | 385268 |
region
Categorical
Distinct | 9 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
Northern America | 37936 |
---|---|
Western Europe | 21255 |
Asiatic Region | 4399 |
Eastern Europe | 3397 |
Latin America | 1176 |
Other values (4) | 2064 |
Length
Max length | 18 |
---|---|
Median length | 16 |
Mean length | 15.006166 |
Min length | 6 |
Characters and Unicode
Total characters | 1053838 |
---|---|
Distinct characters | 27 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Northern America |
---|---|
2nd row | Northern America |
3rd row | Northern America |
4th row | Northern America |
5th row | Northern America |
Common Values
Value | Count | Frequency (%) |
Northern America | 37936 | 54.0% |
Western Europe | 21255 | 30.3% |
Asiatic Region | 4399 | 6.3% |
Eastern Europe | 3397 | 4.8% |
Latin America | 1176 | 1.7% |
Middle East | 892 | 1.3% |
Pacific Region | 792 | 1.1% |
Africa | 240 | 0.3% |
Africa/Middle East | 140 | 0.2% |
Words
Value | Count | Frequency (%) |
america | 39112 | |
northern | 37936 | |
europe | 24652 | |
western | 21255 | |
region | 5191 | 3.7% |
asiatic | 4399 | 3.1% |
eastern | 3397 | 2.4% |
latin | 1176 | 0.8% |
east | 1032 | 0.7% |
middle | 892 | 0.6% |
Other values (3) | 1172 | 0.8% |
Most occurring characters
Value | Count | Frequency (%) |
r | 164668 | |
e | 153830 | |
(space) | 69987 | 6.6% |
t | 69195 | 6.6% |
n | 68955 | 6.5% |
o | 67779 | 6.4% |
i | 57273 | 5.4% |
a | 50288 | 4.8% |
c | 45475 | 4.3% |
A | 43891 | 4.2% |
Other values (17) | 262497 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 843357 | |
Uppercase Letter | 140354 | 13.3% |
Space Separator | 69987 | 6.6% |
Other Punctuation | 140 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
r | 164668 | |
e | 153830 | |
t | 69195 | |
n | 68955 | |
o | 67779 | |
i | 57273 | 6.8% |
a | 50288 | 6.0% |
c | 45475 | 5.4% |
m | 39112 | 4.6% |
h | 37936 | 4.5% |
Other values (7) | 88846 |
Uppercase Letter
Value | Count | Frequency (%) |
A | 43891 | |
N | 37936 | |
E | 29081 | |
W | 21255 | |
R | 5191 | 3.7% |
L | 1176 | 0.8% |
M | 1032 | 0.7% |
P | 792 | 0.6% |
Space Separator
Value | Count | Frequency (%) |
(space) | 69987 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 140 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 983711 | |
Common | 70127 | 6.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
r | 164668 | |
e | 153830 | |
t | 69195 | 7.0% |
n | 68955 | 7.0% |
o | 67779 | 6.9% |
i | 57273 | 5.8% |
a | 50288 | 5.1% |
c | 45475 | 4.6% |
A | 43891 | 4.5% |
m | 39112 | 4.0% |
Other values (15) | 223245 |
Common
Value | Count | Frequency (%) |
(space) | 69987 | |
/ | 140 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1053838 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
r | 164668 | |
e | 153830 | |
(space) | 69987 | 6.6% |
t | 69195 | 6.6% |
n | 68955 | 6.5% |
o | 67779 | 6.4% |
i | 57273 | 5.4% |
a | 50288 | 4.8% |
c | 45475 | 4.3% |
A | 43891 | 4.2% |
Other values (17) | 262497 |
title
Categorical
HIGH CARDINALITY
UNIFORM
Distinct | 68293 |
---|---|
Distinct (%) | 97.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
Optics InfoBase Conference Papers | 51 |
---|---|
22nd International Congress on Sound and Vibration, ICSV 2015 | 11 |
41st EPS Conference on Plasma Physics, EPS 2014 | 10 |
Proceedings of the ASME Turbo Expo | 9 |
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics | 7 |
Other values (68288) | 70139 |
Length
Max length | 444 |
---|---|
Median length | 248 |
Mean length | 62.681775 |
Min length | 2 |
Characters and Unicode
Total characters | 4401953 |
---|---|
Distinct characters | 87 |
Distinct categories | 10 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 66867 |
---|---|
Unique (%) | 95.2% |
Sample
1st row | Journal of Technology in Counseling |
---|---|
2nd row | Journal of the Experimental Analysis of Behavior |
3rd row | Journal of the History of the Behavioral Sciences |
4th row | Journal of Trauma and Dissociation |
5th row | Journal of Traumatic Stress |
Common Values
Value | Count | Frequency (%) |
Optics InfoBase Conference Papers | 51 | 0.1% |
22nd International Congress on Sound and Vibration, ICSV 2015 | 11 | < 0.1% |
41st EPS Conference on Plasma Physics, EPS 2014 | 10 | < 0.1% |
Proceedings of the ASME Turbo Expo | 9 | < 0.1% |
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics | 7 | < 0.1% |
National Radio Science Conference, NRSC, Proceedings | 7 | < 0.1% |
2014 13th International Conference on Control Automation Robotics and Vision, ICARCV 2014 | 7 | < 0.1% |
Proceedings of the Annual International Conference on Mobile Computing and Networking, MOBICOM | 7 | < 0.1% |
DOLAP: Proceedings of the ACM International Workshop on Data Warehousing and OLAP | 6 | < 0.1% |
Proceedings of the Electronic Packaging Technology Conference, EPTC | 6 | < 0.1% |
Other values (68283) | 70106 |
Words
Value | Count | Frequency (%) |
and | 31436 | 5.4% |
of | 27668 | 4.7% |
on | 23799 | 4.1% |
international | 22208 | 3.8% |
conference | 21554 | 3.7% |
(empty) | 18233 | 3.1% |
proceedings | 18050 | 3.1% |
the | 14273 | 2.4% |
journal | 10937 | 1.9% |
in | 7122 | 1.2% |
Other values (31286) | 391802 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 516927 | 11.7% |
n | 388063 | 8.8% |
e | 365186 | 8.3% |
o | 304133 | 6.9% |
i | 260449 | 5.9% |
a | 241835 | 5.5% |
t | 223534 | 5.1% |
r | 210511 | 4.8% |
s | 148099 | 3.4% |
c | 147729 | 3.4% |
Other values (77) | 1595487 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 3044695 | |
Uppercase Letter | 556104 | 12.6% |
Space Separator | 516927 | 11.7% |
Decimal Number | 208172 | 4.7% |
Other Punctuation | 47795 | 1.1% |
Dash Punctuation | 24581 | 0.6% |
Open Punctuation | 1757 | < 0.1% |
Close Punctuation | 1752 | < 0.1% |
Math Symbol | 152 | < 0.1% |
Connector Punctuation | 18 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
n | 388063 | |
e | 365186 | |
o | 304133 | |
i | 260449 | |
a | 241835 | 7.9% |
t | 223534 | 7.3% |
r | 210511 | 6.9% |
s | 148099 | 4.9% |
c | 147729 | 4.9% |
l | 129958 | 4.3% |
Other values (16) | 625198 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 74882 | |
I | 67952 | |
S | 60891 | |
E | 59521 | |
P | 46901 | |
A | 43460 | 7.8% |
M | 32939 | 5.9% |
T | 24405 | 4.4% |
R | 18317 | 3.3% |
D | 15284 | 2.7% |
Other values (16) | 111552 |
Other Punctuation
Value | Count | Frequency (%) |
, | 33321 | |
: | 5506 | 11.5% |
' | 2817 | 5.9% |
/ | 2603 | 5.4% |
. | 1865 | 3.9% |
; | 761 | 1.6% |
& | 382 | 0.8% |
" | 268 | 0.6% |
# | 176 | 0.4% |
? | 30 | 0.1% |
Other values (4) | 66 | 0.1% |
Decimal Number
Value | Count | Frequency (%) |
0 | 60981 | |
2 | 52105 | |
1 | 45042 | |
5 | 7477 | 3.6% |
9 | 7409 | 3.6% |
4 | 7390 | 3.5% |
8 | 7300 | 3.5% |
3 | 7183 | 3.5% |
6 | 6989 | 3.4% |
7 | 6296 | 3.0% |
Math Symbol
Value | Count | Frequency (%) |
= | 103 | |
+ | 48 | |
| | 1 | 0.7% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 24580 | |
– | 1 | < 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1735 | |
[ | 22 | 1.3% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1730 | |
] | 22 | 1.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 516927 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 18 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 3600799 | |
Common | 801154 | 18.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
n | 388063 | 10.8% |
e | 365186 | 10.1% |
o | 304133 | 8.4% |
i | 260449 | 7.2% |
a | 241835 | 6.7% |
t | 223534 | 6.2% |
r | 210511 | 5.8% |
s | 148099 | 4.1% |
c | 147729 | 4.1% |
l | 129958 | 3.6% |
Other values (42) | 1181302 |
Common
Value | Count | Frequency (%) |
(space) | 516927 | |
0 | 60981 | 7.6% |
2 | 52105 | 6.5% |
1 | 45042 | 5.6% |
, | 33321 | 4.2% |
- | 24580 | 3.1% |
5 | 7477 | 0.9% |
9 | 7409 | 0.9% |
4 | 7390 | 0.9% |
8 | 7300 | 0.9% |
Other values (25) | 38622 | 4.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 4401952 | |
Punctuation | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 516927 | 11.7% |
n | 388063 | 8.8% |
e | 365186 | 8.3% |
o | 304133 | 6.9% |
i | 260449 | 5.9% |
a | 241835 | 5.5% |
t | 223534 | 5.1% |
r | 210511 | 4.8% |
s | 148099 | 3.4% |
c | 147729 | 3.4% |
Other values (76) | 1595486 |
Punctuation
Value | Count | Frequency (%) |
– | 1 |
type
Categorical
Distinct | 4 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 548.8 KiB |
journal | 34331 |
---|---|
conference and proceedings | 33645 |
book series | 1462 |
trade journal | 789 |
Common Values
Value | Count | Frequency (%) |
journal | 34331 | 48.9% |
conference and proceedings | 33645 | 47.9% |
book series | 1462 | 2.1% |
trade journal | 789 | 1.1% |
Words
Value | Count | Frequency (%) |
journal | 35120 | |
conference | 33645 | |
and | 33645 | |
proceedings | 33645 | |
book | 1462 | 1.0% |
series | 1462 | 1.0% |
trade | 789 | 0.6% |
Most occurring characters
Value | Count | Frequency (%) |
e | 171938 | |
n | 169700 | |
o | 105334 | |
r | 104661 | |
c | 100935 | |
a | 69554 | 6.1% |
(space) | 69541 | 6.1% |
d | 68079 | 6.0% |
s | 36569 | 3.2% |
j | 35120 | 3.1% |
Other values (9) | 209995 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1071885 | |
Space Separator | 69541 | 6.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 171938 | |
n | 169700 | |
o | 105334 | |
r | 104661 | |
c | 100935 | |
a | 69554 | 6.5% |
d | 68079 | 6.4% |
s | 36569 | 3.4% |
j | 35120 | 3.3% |
l | 35120 | 3.3% |
Other values (8) | 174875 |
Space Separator
Value | Count | Frequency (%) |
(space) | 69541 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1071885 | |
Common | 69541 | 6.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 171938 | |
n | 169700 | |
o | 105334 | |
r | 104661 | |
c | 100935 | |
a | 69554 | 6.5% |
d | 68079 | 6.4% |
s | 36569 | 3.4% |
j | 35120 | 3.3% |
l | 35120 | 3.3% |
Other values (8) | 174875 |
Common
Value | Count | Frequency (%) |
(space) | 69541 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1141426 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 171938 | |
n | 169700 | |
o | 105334 | |
r | 104661 | |
c | 100935 | |
a | 69554 | 6.1% |
(space) | 69541 | 6.1% |
d | 68079 | 6.0% |
s | 36569 | 3.2% |
j | 35120 | 3.1% |
Other values (9) | 209995 |
Correlations
| | sourceid | region | type |
---|---|---|---|
sourceid | 1.000 | 0.090 | 0.315 |
region | 0.090 | 1.000 | 0.337 |
type | 0.315 | 0.337 | 1.000 |
First rows
| | sourceid | country | coverage | issn | publisher | region | title | type |
---|---|---|---|---|---|---|---|---|
0 | 12000 | United States | 1999-2003, 2005, 2008 | 15276228 | Columbus State University | Northern America | Journal of Technology in Counseling | journal |
1 | 12001 | United States | 1958-2021 | 00225002, 19383711 | Wiley-Blackwell | Northern America | Journal of the Experimental Analysis of Behavior | journal |
2 | 12002 | United States | 1969-2021 | 00225061, 15206696 | John Wiley & Sons Inc. | Northern America | Journal of the History of the Behavioral Sciences | journal |
3 | 12004 | United States | 2000-2021 | 15299740, 15299732 | Routledge | Northern America | Journal of Trauma and Dissociation | journal |
4 | 12005 | United States | 1988-2021 | 15736598, 08949867 | Wiley-Blackwell | Northern America | Journal of Traumatic Stress | journal |
5 | 12006 | United States | 1971-2021 | 10959084, 00018791 | Academic Press Inc. | Northern America | Journal of Vocational Behavior | journal |
6 | 12007 | Hungary | 1946, 1948, 1977-1999 | 00390690 | Kozponti Statisztikai Hivatal | Eastern Europe | Statisztikai Szemle | journal |
7 | 12008 | Hungary | 1980, 1982-1983, 1985, 2016-2021 | 20648251, 00187828 | Hungarian Central Statistical Office | Eastern Europe | Teruleti Statisztika | journal |
8 | 12009 | Germany | 2000-2018 | 09426051 | J.C. Cotta'sche Buchhandlung Nachvolger GmbH | Western Europe | Kinderanalyse (discontinued) | journal |
9 | 12010 | United States | 1950-1958, 1960-1963, 1965-2021 | 00664308, 15452085 | Annual Reviews Inc. | Northern America | Annual Review of Psychology | journal |
Last rows
| | sourceid | country | coverage | issn | publisher | region | title | type |
---|---|---|---|---|---|---|---|---|
70217 | 21101058963 | United States | 2016-2021 | 20597991 | SAGE Publications Inc. | Northern America | Methodological Innovations | journal |
70218 | 21101058966 | Denmark | 2021 | 22468498 | Aalborg University Press | Western Europe | Journal of Somaesthetics | journal |
70219 | 21101059010 | Netherlands | 2021 | 25424246, 25424238 | Brill Academic Publishers | Western Europe | International Journal of Asian Christianity | journal |
70220 | 21101059012 | Germany | 2021 | 25693263 | Walter de Gruyter GmbH | Western Europe | Chemistry Teacher International | journal |
70221 | 21101059299 | United States | 2014-2021 | 23482451, 23220058 | SAGE Publications Inc. | Northern America | Asian Journal of Legal Education | journal |
70222 | 21101059300 | Ukraine | 2021 | 20753829, 20753810 | V. N. Karazin Kharkiv National University | Eastern Europe | Biophysical Bulletin | journal |
70223 | 21101059489 | United States | 2021 | 25735985 | EnPress Publisher, LLC | Northern America | Trends in Immunotherapy | journal |
70224 | 21101059784 | China | 2020-2021 | 20961146 | Chinese Academy of Sciences | Asiatic Region | Journal of Cyber Security | journal |
70225 | 21101059785 | Thailand | 2015-2021 | 24523151 | Kasetsart University Research and Development Institute | Asiatic Region | Kasetsart Journal of Social Sciences | journal |
70226 | 21101059786 | United States | 2019-2020 | 15297470, 15336239 | Brookings Institution Press | Northern America | Economia | journal |