Dataset statistics
| Number of variables | 5 |
|---|---|
| Number of observations | 150 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 1 |
| Duplicate rows (%) | 0.7% |
| Total size in memory | 6.0 KiB |
| Average record size in memory | 40.9 B |
Variable types
| Numeric | 4 |
|---|---|
| Categorical | 1 |
Alerts
| Dataset has 1 (0.7%) duplicate rows | Duplicates |
|---|---|
| petal_length is highly overall correlated with petal_width and 2 other fields | High correlation |
| petal_width is highly overall correlated with petal_length and 2 other fields | High correlation |
| sepal_length is highly overall correlated with petal_length and 2 other fields | High correlation |
| species is highly overall correlated with petal_length and 2 other fields | High correlation |
| species is uniformly distributed | Uniform |
Reproduction
| Analysis started | 2024-07-14 05:42:27.904794 |
|---|---|
| Analysis finished | 2024-07-14 05:42:29.649817 |
| Duration | 1.75 seconds |
| Software version | ydata-profiling v4.8.3 |
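A report like this one can be regenerated with ydata-profiling. The sketch below assumes the data is available as a CSV file. The helper name `build_report`, the file names, and the report title are hypothetical; `ProfileReport` and `to_file` are the library's entry points.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Iris profiling report")  # assumed title
    profile.to_file(out_path)


# Example call (file name is an assumption, not taken from the report):
# build_report("iris.csv")
```

Timestamps and duration will differ between runs, and other figures may shift slightly with the library version.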
sepal_length
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 35 |
|---|---|
| Distinct (%) | 23.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.8433333 |
| Minimum | 4.3 |
| Maximum | 7.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.3 KiB |
Quantile statistics
| Minimum | 4.3 |
|---|---|
| 5-th percentile | 4.6 |
| Q1 | 5.1 |
| median | 5.8 |
| Q3 | 6.4 |
| 95-th percentile | 7.255 |
| Maximum | 7.9 |
| Range | 3.6 |
| Interquartile range (IQR) | 1.3 |
Descriptive statistics
| Standard deviation | 0.82806613 |
|---|---|
| Coefficient of variation (CV) | 0.14171126 |
| Kurtosis | -0.55206404 |
| Mean | 5.8433333 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | 0.31491096 |
| Sum | 876.5 |
| Variance | 0.68569351 |
| Monotonicity | Not monotonic |
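Several of the sepal_length figures above follow from one another. As a quick consistency check, this sketch recomputes the derived statistics from the reported mean, standard deviation, and quantiles; the variable names are illustrative only.

```python
# Values copied from the sepal_length tables above.
n_observations = 150
mean, std = 5.8433333, 0.82806613
minimum, q1, q3, maximum = 4.3, 5.1, 6.4, 7.9

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n_observations    # sum of all values

print(f"CV={cv:.6f} variance={variance:.6f} "
      f"range={value_range:.1f} IQR={iqr:.1f} sum={total:.1f}")
```

The same relations hold for the other numeric variables in the report.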
Histogram with fixed size bins (bins=35)
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 10 | 6.7% |
| 5.1 | 9 | 6.0% |
| 6.3 | 9 | 6.0% |
| 5.7 | 8 | 5.3% |
| 6.7 | 8 | 5.3% |
| 5.8 | 7 | 4.7% |
| 5.5 | 7 | 4.7% |
| 6.4 | 7 | 4.7% |
| 4.9 | 6 | 4.0% |
| 5.4 | 6 | 4.0% |
| Other values (25) | 73 | 48.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 4.3 | 1 | 0.7% |
| 4.4 | 3 | 2.0% |
| 4.5 | 1 | 0.7% |
| 4.6 | 4 | 2.7% |
| 4.7 | 2 | 1.3% |
| 4.8 | 5 | 3.3% |
| 4.9 | 6 | 4.0% |
| 5 | 10 | 6.7% |
| 5.1 | 9 | 6.0% |
| 5.2 | 4 | 2.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 7.9 | 1 | 0.7% |
| 7.7 | 4 | 2.7% |
| 7.6 | 1 | 0.7% |
| 7.4 | 1 | 0.7% |
| 7.3 | 1 | 0.7% |
| 7.2 | 3 | 2.0% |
| 7.1 | 1 | 0.7% |
| 7 | 1 | 0.7% |
| 6.9 | 4 | 2.7% |
| 6.8 | 3 | 2.0% |
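The Frequency (%) column in these tables is each count divided by the 150 observations, shown to one decimal place. A minimal sketch of that arithmetic, with `frequency_pct` as a hypothetical helper name:

```python
N_OBSERVATIONS = 150  # "Number of observations" in the dataset statistics


def frequency_pct(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format a count as a percentage of all rows, one decimal place."""
    return f"{100 * count / total:.1f}%"


# A few (value, count) pairs taken from the sepal_length tables.
for value, count in [(4.8, 5), (5.0, 10), (7.7, 4)]:
    print(value, count, frequency_pct(count))
```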
sepal_width
Real number (ℝ)
| Distinct | 23 |
|---|---|
| Distinct (%) | 15.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.0573333 |
| Minimum | 2 |
| Maximum | 4.4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.3 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2.345 |
| Q1 | 2.8 |
| median | 3 |
| Q3 | 3.3 |
| 95-th percentile | 3.8 |
| Maximum | 4.4 |
| Range | 2.4 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 0.43586628 |
|---|---|
| Coefficient of variation (CV) | 0.1425642 |
| Kurtosis | 0.22824904 |
| Mean | 3.0573333 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | 0.31896566 |
| Sum | 458.6 |
| Variance | 0.18997942 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=23)
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 26 | 17.3% |
| 2.8 | 14 | 9.3% |
| 3.2 | 13 | 8.7% |
| 3.4 | 12 | 8.0% |
| 3.1 | 11 | 7.3% |
| 2.9 | 10 | 6.7% |
| 2.7 | 9 | 6.0% |
| 2.5 | 8 | 5.3% |
| 3.5 | 6 | 4.0% |
| 3.3 | 6 | 4.0% |
| Other values (13) | 35 | 23.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1 | 0.7% |
| 2.2 | 3 | 2.0% |
| 2.3 | 4 | 2.7% |
| 2.4 | 3 | 2.0% |
| 2.5 | 8 | 5.3% |
| 2.6 | 5 | 3.3% |
| 2.7 | 9 | 6.0% |
| 2.8 | 14 | 9.3% |
| 2.9 | 10 | 6.7% |
| 3 | 26 | 17.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 4.4 | 1 | 0.7% |
| 4.2 | 1 | 0.7% |
| 4.1 | 1 | 0.7% |
| 4 | 1 | 0.7% |
| 3.9 | 2 | 1.3% |
| 3.8 | 6 | 4.0% |
| 3.7 | 3 | 2.0% |
| 3.6 | 4 | 2.7% |
| 3.5 | 6 | 4.0% |
| 3.4 | 12 | 8.0% |
petal_length
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 43 |
|---|---|
| Distinct (%) | 28.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.758 |
| Minimum | 1 |
| Maximum | 6.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1.3 |
| Q1 | 1.6 |
| median | 4.35 |
| Q3 | 5.1 |
| 95-th percentile | 6.1 |
| Maximum | 6.9 |
| Range | 5.9 |
| Interquartile range (IQR) | 3.5 |
Descriptive statistics
| Standard deviation | 1.7652982 |
|---|---|
| Coefficient of variation (CV) | 0.46974407 |
| Kurtosis | -1.4021034 |
| Mean | 3.758 |
| Median Absolute Deviation (MAD) | 1.25 |
| Skewness | -0.27488418 |
| Sum | 563.7 |
| Variance | 3.1162779 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=43)
| Value | Count | Frequency (%) |
|---|---|---|
| 1.4 | 13 | 8.7% |
| 1.5 | 13 | 8.7% |
| 5.1 | 8 | 5.3% |
| 4.5 | 8 | 5.3% |
| 1.6 | 7 | 4.7% |
| 1.3 | 7 | 4.7% |
| 5.6 | 6 | 4.0% |
| 4.7 | 5 | 3.3% |
| 4.9 | 5 | 3.3% |
| 4 | 5 | 3.3% |
| Other values (33) | 73 | 48.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 0.7% |
| 1.1 | 1 | 0.7% |
| 1.2 | 2 | 1.3% |
| 1.3 | 7 | 4.7% |
| 1.4 | 13 | 8.7% |
| 1.5 | 13 | 8.7% |
| 1.6 | 7 | 4.7% |
| 1.7 | 4 | 2.7% |
| 1.9 | 2 | 1.3% |
| 3 | 1 | 0.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 6.9 | 1 | 0.7% |
| 6.7 | 2 | 1.3% |
| 6.6 | 1 | 0.7% |
| 6.4 | 1 | 0.7% |
| 6.3 | 1 | 0.7% |
| 6.1 | 3 | 2.0% |
| 6 | 2 | 1.3% |
| 5.9 | 2 | 1.3% |
| 5.8 | 3 | 2.0% |
| 5.7 | 3 | 2.0% |
petal_width
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 22 |
|---|---|
| Distinct (%) | 14.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.1993333 |
| Minimum | 0.1 |
| Maximum | 2.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.3 KiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 0.2 |
| Q1 | 0.3 |
| median | 1.3 |
| Q3 | 1.8 |
| 95-th percentile | 2.3 |
| Maximum | 2.5 |
| Range | 2.4 |
| Interquartile range (IQR) | 1.5 |
Descriptive statistics
| Standard deviation | 0.76223767 |
|---|---|
| Coefficient of variation (CV) | 0.63555114 |
| Kurtosis | -1.340604 |
| Mean | 1.1993333 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | -0.10296675 |
| Sum | 179.9 |
| Variance | 0.58100626 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=22)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.2 | 29 | 19.3% |
| 1.3 | 13 | 8.7% |
| 1.8 | 12 | 8.0% |
| 1.5 | 12 | 8.0% |
| 1.4 | 8 | 5.3% |
| 2.3 | 8 | 5.3% |
| 1 | 7 | 4.7% |
| 0.4 | 7 | 4.7% |
| 0.3 | 7 | 4.7% |
| 2.1 | 6 | 4.0% |
| Other values (12) | 41 | 27.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.1 | 5 | 3.3% |
| 0.2 | 29 | 19.3% |
| 0.3 | 7 | 4.7% |
| 0.4 | 7 | 4.7% |
| 0.5 | 1 | 0.7% |
| 0.6 | 1 | 0.7% |
| 1 | 7 | 4.7% |
| 1.1 | 3 | 2.0% |
| 1.2 | 5 | 3.3% |
| 1.3 | 13 | 8.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 2.5 | 3 | 2.0% |
| 2.4 | 3 | 2.0% |
| 2.3 | 8 | 5.3% |
| 2.2 | 3 | 2.0% |
| 2.1 | 6 | 4.0% |
| 2 | 6 | 4.0% |
| 1.9 | 5 | 3.3% |
| 1.8 | 12 | 8.0% |
| 1.7 | 2 | 1.3% |
| 1.6 | 4 | 2.7% |
species
Categorical
HIGH CORRELATION  UNIFORM 
| Distinct | 3 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 KiB |
| setosa | 50 |
|---|---|
| versicolor | 50 |
| virginica | 50 |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.3333333 |
| Min length | 6 |
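The length statistics above can be reproduced from the three species labels alone, given that each label occurs 50 times (as listed under Common Values). A small sketch using only the standard library:

```python
from statistics import mean, median

# 50 observations per species, as listed under Common Values.
labels = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50
lengths = [len(label) for label in labels]

max_length = max(lengths)
median_length = median(lengths)
mean_length = mean(lengths)
min_length = min(lengths)

print(max_length, median_length, round(mean_length, 7), min_length)
```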
Characters and Unicode
| Total characters | 1250 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | setosa |
|---|---|
| 2nd row | setosa |
| 3rd row | setosa |
| 4th row | setosa |
| 5th row | setosa |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| setosa | 50 | 33.3% |
| versicolor | 50 | 33.3% |
| virginica | 50 | 33.3% |
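The Uniform alert on species reflects these perfectly balanced counts. The report does not show how the alert is decided, so the sketch below assumes a Pearson chi-square goodness-of-fit statistic against equal expected counts, which is zero when every category has the same frequency.

```python
observed = {"setosa": 50, "versicolor": 50, "virginica": 50}

total = sum(observed.values())
expected = total / len(observed)  # equal share per category under uniformity

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())
print(chi_square)
```

Any imbalance between the categories would raise the statistic above zero.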
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| i | 200 | 16.0% |
| s | 150 | 12.0% |
| o | 150 | 12.0% |
| r | 150 | 12.0% |
| e | 100 | 8.0% |
| a | 100 | 8.0% |
| v | 100 | 8.0% |
| c | 100 | 8.0% |
| t | 50 | 4.0% |
| l | 50 | 4.0% |
| Other values (2) | 100 | 8.0% |
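The character counts above can be rebuilt by counting letters across all 150 species labels. A minimal sketch with `collections.Counter`, assuming 50 rows per species as in the Common Values table:

```python
from collections import Counter

# 50 observations per species, as listed under Common Values.
labels = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50

char_counts = Counter("".join(labels))
total_chars = sum(char_counts.values())

# Print each character with its count and share of all characters.
for char, count in char_counts.most_common():
    print(char, count, f"{100 * count / total_chars:.1f}%")
```

The two characters grouped under "Other values (2)" are `g` and `n`, each occurring 50 times.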
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1250 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
|---|---|---|
| i | 200 | 16.0% |
| s | 150 | 12.0% |
| o | 150 | 12.0% |
| r | 150 | 12.0% |
| e | 100 | 8.0% |
| a | 100 | 8.0% |
| v | 100 | 8.0% |
| c | 100 | 8.0% |
| t | 50 | 4.0% |
| l | 50 | 4.0% |
| Other values (2) | 100 | 8.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1250 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
|---|---|---|
| i | 200 | 16.0% |
| s | 150 | 12.0% |
| o | 150 | 12.0% |
| r | 150 | 12.0% |
| e | 100 | 8.0% |
| a | 100 | 8.0% |
| v | 100 | 8.0% |
| c | 100 | 8.0% |
| t | 50 | 4.0% |
| l | 50 | 4.0% |
| Other values (2) | 100 | 8.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1250 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
|---|---|---|
| i | 200 | 16.0% |
| s | 150 | 12.0% |
| o | 150 | 12.0% |
| r | 150 | 12.0% |
| e | 100 | 8.0% |
| a | 100 | 8.0% |
| v | 100 | 8.0% |
| c | 100 | 8.0% |
| t | 50 | 4.0% |
| l | 50 | 4.0% |
| Other values (2) | 100 | 8.0% |
Correlations
| | petal_length | petal_width | sepal_length | sepal_width | species |
|---|---|---|---|---|---|
| petal_length | 1.000 | 0.938 | 0.882 | -0.310 | 0.890 |
| petal_width | 0.938 | 1.000 | 0.834 | -0.289 | 0.924 |
| sepal_length | 0.882 | 0.834 | 1.000 | -0.167 | 0.617 |
| sepal_width | -0.310 | -0.289 | -0.167 | 1.000 | 0.446 |
| species | 0.890 | 0.924 | 0.617 | 0.446 | 1.000 |
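The High correlation alerts near the top of the report can be cross-checked against this matrix. The sketch below flags every pair whose absolute coefficient reaches a cutoff of 0.5. That cutoff is an assumption (the report does not state the value used), but it reproduces the alerts: three correlated fields each for petal_length, petal_width, sepal_length, and species, and none for sepal_width.

```python
THRESHOLD = 0.5  # assumed cutoff; the report does not state the value it used

columns = ["petal_length", "petal_width", "sepal_length", "sepal_width", "species"]
matrix = [
    [1.000, 0.938, 0.882, -0.310, 0.890],
    [0.938, 1.000, 0.834, -0.289, 0.924],
    [0.882, 0.834, 1.000, -0.167, 0.617],
    [-0.310, -0.289, -0.167, 1.000, 0.446],
    [0.890, 0.924, 0.617, 0.446, 1.000],
]


def correlated_with(name: str) -> list[str]:
    """Return the other columns whose |coefficient| with `name` meets the cutoff."""
    row = matrix[columns.index(name)]
    return [
        other
        for other, value in zip(columns, row)
        if other != name and abs(value) >= THRESHOLD
    ]


for name in columns:
    print(name, correlated_with(name))
```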
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 5 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |
| 6 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 7 | 5.0 | 3.4 | 1.5 | 0.2 | setosa |
| 8 | 4.4 | 2.9 | 1.4 | 0.2 | setosa |
| 9 | 4.9 | 3.1 | 1.5 | 0.1 | setosa |
Last rows
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 140 | 6.7 | 3.1 | 5.6 | 2.4 | virginica |
| 141 | 6.9 | 3.1 | 5.1 | 2.3 | virginica |
| 142 | 5.8 | 2.7 | 5.1 | 1.9 | virginica |
| 143 | 6.8 | 3.2 | 5.9 | 2.3 | virginica |
| 144 | 6.7 | 3.3 | 5.7 | 2.5 | virginica |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | virginica |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | virginica |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | virginica |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | virginica |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | virginica |
Most frequently occurring
| | sepal_length | sepal_width | petal_length | petal_width | species | # duplicates |
|---|---|---|---|---|---|---|
| 0 | 5.8 | 2.7 | 5.1 | 1.9 | virginica | 2 |
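A duplicate-row count like the one above can be obtained by counting identical records. The sketch below uses two rows from the sample (142 and 149) plus an illustrative repeat of row 142; the position of the actual second copy in the dataset is not shown in this report.

```python
from collections import Counter

rows = [
    (5.8, 2.7, 5.1, 1.9, "virginica"),  # row 142 in the sample above
    (5.9, 3.0, 5.1, 1.8, "virginica"),  # row 149 in the sample above
    (5.8, 2.7, 5.1, 1.9, "virginica"),  # illustrative repeat of row 142
]

# Count identical records and keep those that occur more than once.
occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}
print(duplicates)
```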