The descriptive function in TFSA


The new tool in TFSA version 1.1 is the descriptive function. It is designed for exploring data quickly, with less R programming. However, the function assumes that the data are 'continuous', not 'discrete'. The function can do three important things:

1. The function returns the following values in the form of a data frame:

  • Total number of observations
  • Mean
  • Median
  • Minimum
  • Maximum
  • Variance
  • Standard deviation
  • Range

2. The function can handle grouped data, for example when the observations are measured at different locations or classified by sex.

3. The descriptive information can be visualized as a histogram, box plot or dot plot.

Some examples are provided below using the fishgrp data, which comes with the TFSA package. If you want to use your own data, import them into R first. If R is not installed yet, check my old post on how to install R. First, tell R to load the TFSA package.

library(TFSA)

Now load the fishgrp data from TFSA

data(fishgrp)

Now view the structure of fishgrp data

str(fishgrp)

'data.frame':   506 obs. of  3 variables:
 $ Stock: Factor w/ 4 levels "Calcutta","Gujarat",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ Sex  : Factor w/ 2 levels "F","M": 2 2 1 1 1 1 1 2 1 2 ...
 $ SL   : int  228 264 244 209 220 202 242 256 249 170 ...

This tells us that the data contain two attributes, or factors (Stock and Sex), and one column of numerical observations (SL, the standard length). To view the whole data set, type fishgrp and press the ENTER key. Now we will use the descriptive function in TFSA to get the descriptive statistics of standard length (SL).

descriptive(fishgrp$SL)

    N     Mean Median Minimum Maximum Variance  Std dev Range
1 506 202.0435    193     127     299 926.9684 30.44616   172

The '$' symbol is used in R to choose a variable (SL) from a data set (fishgrp). The result shows a lot of information: the total number of observations is 506, the mean is 202.04, and so on. For more details on the results, type ?descriptive and press the ENTER key.
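To make the columns of that output concrete, here is a minimal base R sketch that builds the same kind of one-row data frame. It is not the TFSA implementation: the helper name summary_stats is made up for illustration, the vector holds only the first ten SL values shown in the str() output above, and it assumes that Variance and Std dev are the usual sample versions returned by var() and sd().

```r
# First ten SL values as printed by str(fishgrp) above (not the full data set)
sl <- c(228, 264, 244, 209, 220, 202, 242, 256, 249, 170)

# Hypothetical helper: one-row data frame with the same column names
# that descriptive() prints. Assumes sample variance (n - 1 denominator).
summary_stats <- function(x) {
  x <- x[!is.na(x)]  # dropping NAs is a choice made here, not necessarily TFSA's
  data.frame(
    N         = length(x),
    Mean      = mean(x),
    Median    = median(x),
    Minimum   = min(x),
    Maximum   = max(x),
    Variance  = var(x),
    `Std dev` = sd(x),
    Range     = max(x) - min(x),  # a single number, as in the output above
    check.names = FALSE           # keep the space in "Std dev"
  )
}

s <- summary_stats(sl)
print(s)
```

Note that Range here is the difference between the maximum and the minimum, which matches the output above (299 - 127 = 172).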

To visualize the histogram, use the argument Histogram=TRUE

descriptive(fishgrp$SL,Histogram=TRUE)

To visualize the box plot, use the argument Boxplot=TRUE

descriptive(fishgrp$SL,Boxplot=TRUE)

To visualize the dot plot, use the argument Dotplot=TRUE

descriptive(fishgrp$SL,Dotplot=TRUE)
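For comparison, the three plot types can also be drawn with plain base R graphics. This is only a rough sketch: the plots produced by TFSA may be styled differently, stripchart() is my assumption for what a dot plot looks like here, and the vector again holds just the first ten SL values from the str() output.

```r
# First ten SL values as printed by str(fishgrp) above
sl <- c(228, 264, 244, 209, 220, 202, 242, 256, 249, 170)

# Base R counterparts of the three plot types
hist(sl, main = "Histogram of SL", xlab = "Standard length")
boxplot(sl, main = "Box plot of SL", ylab = "Standard length")
stripchart(sl, method = "stack", pch = 16,
           main = "Dot plot of SL", xlab = "Standard length")

# The numbers behind the plots can be obtained without drawing anything
bins <- hist(sl, plot = FALSE)     # bin breaks and counts
five <- boxplot.stats(sl)$stats    # whisker, hinge, median, hinge, whisker
print(bins$counts)
print(five)
```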

To obtain descriptive statistics for each category, use the argument groups=. For example, the fishgrp data have observations for four different fish stocks: Calcutta, Gujarat, Mumbai and Orissa.

descriptive(fishgrp$SL,groups=fishgrp$Stock,Boxplot=TRUE)

           N     Mean Median Minimum Maximum  Variance   Std dev Range
Calcutta 133 183.3459    183     160     211  95.31887  9.763138    51
Gujarat  113 197.9115    192     148     274 775.63496 27.850224   126
Mumbai   119 232.9832    234     157     299 534.62683 23.121999   142
Orissa   141 196.8794    187     127     266 996.27822 31.563875   139
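The idea behind the grouped table can be sketched in base R by splitting the measurements by a factor and summarising each piece. The toy data frame below is invented for illustration (it only borrows the column names Stock and SL), and this is not how TFSA necessarily does it internally.

```r
# Hypothetical toy data using the same column names as fishgrp
toy <- data.frame(
  Stock = factor(c("Calcutta", "Calcutta", "Calcutta",
                   "Mumbai", "Mumbai", "Mumbai")),
  SL    = c(180, 185, 190, 230, 240, 235)
)

# Split SL by Stock, summarise each group, then stack the rows;
# the group names become the row names of the result
by_stock <- do.call(rbind, lapply(split(toy$SL, toy$Stock), function(x) {
  data.frame(N = length(x), Mean = mean(x), Median = median(x),
             Minimum = min(x), Maximum = max(x), Range = max(x) - min(x))
}))
print(by_stock)

# One box per stock, using the formula interface of boxplot()
boxplot(SL ~ Stock, data = toy, ylab = "Standard length")
```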

To obtain descriptive statistics of sub-categories within groups, use the argument divisions=. For example, the fishgrp data have observations for males and females in each stock.

descriptive(fishgrp$SL,groups=fishgrp$Stock,division=fishgrp$Sex,Boxplot=TRUE)

               N     Mean Median Minimum Maximum   Variance   Std dev Range
F in Calcutta 59 185.0847  184.0     167     207   88.66511  9.416215    40
M in Calcutta 74 181.9595  181.0     160     211   97.51888  9.875165    51
F in Gujarat  52 194.3077  185.5     148     259  788.64857 28.082887   111
M in Gujarat  61 200.9836  198.0     155     274  756.64973 27.507267   119
F in Mumbai   65 232.8000  234.0     187     299  460.72500 21.464506   112
M in Mumbai   54 233.2037  233.0     157     287  633.86338 25.176644   130
F in Orissa   62 198.8387  196.5     127     266 1218.07192 34.900887   139
M in Orissa   79 195.3418  184.0     152     251  830.15093 28.812340    99

The same information can be viewed using a histogram or dot plot by simply changing the argument, but use only one plot argument at a time.
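The breakdown by sex within each stock can be approximated in base R by combining the two factors into one. The sketch below uses invented numbers and only the column names Stock, Sex and SL from fishgrp. The labels come out as "F.Calcutta" rather than "F in Calcutta", and it is not meant to reproduce TFSA's own code.

```r
# Hypothetical toy data: two stocks, two sexes, two fish in each cell
toy <- data.frame(
  Stock = factor(rep(c("Calcutta", "Mumbai"), each = 4)),
  Sex   = factor(rep(c("F", "M"), times = 4)),
  SL    = c(184, 181, 186, 183, 233, 230, 235, 236)
)

# interaction() builds one combined factor with levels such as "F.Calcutta"
cell <- interaction(toy$Sex, toy$Stock)

# Summarise SL within each Sex-by-Stock cell
cell_n     <- tapply(toy$SL, cell, length)
cell_means <- tapply(toy$SL, cell, mean)

by_cell <- data.frame(N = as.vector(cell_n),
                      Mean = as.vector(cell_means),
                      row.names = names(cell_means))
print(by_cell)
```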

Thanks for reading 🙂

About Deepak George Pazhayamadom

I'm a fish biologist and a mathematical modeller. I have a wide range of research interests, mostly centered on fisheries resource management.

Posted on August 12, 2013, in TFSA. 1 Comment.

  1. useful package for statistics in fishery
