# The freq.class function of TFSA

1. Download, install and open R.

2. Install the R packages lattice and TFSA (in the R GUI: Packages > Install packages from local zip files). This is required only once. If a package is installed successfully, you will get a message such as “package ‘TFSA’ successfully unpacked and MD5 sums checked”.

3. Load both packages using the library function.

Example: `library(lattice); library(TFSA)`
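For readers who prefer the console to the menu, the setup steps can be sketched as below. The zip file path is a placeholder rather than a real location, and the install line is left commented out so that the snippet is safe to run as it stands.

```r
# Packages this post relies on
pkgs <- c("lattice", "TFSA")

# Check which of them are already installed on this machine
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)

# One-time install from a local zip file (hypothetical path, adjust to yours):
# install.packages("path/to/TFSA.zip", repos = NULL)

# Load whichever of the packages are available in this session
for (p in pkgs[installed]) library(p, character.only = TRUE)
```

If `missing_pkgs` is empty, both packages are ready to use.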

**The freq.class function**

The freq.class function builds a frequency distribution table from a vector of data by grouping the values into class intervals. The *frequency* of a class interval is the number of observations that fall within that predefined interval. For example, if 20 fish in our sample are aged between 3 and 5, the frequency for the 3–5 interval is 20. The *endpoints* of a class interval are the lowest and highest values that a variable can take within that interval. The *class interval width* is the difference between the lower endpoint of an interval and the lower endpoint of the next interval.
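These three terms can be illustrated with base R alone. The sketch below uses a small made-up vector of ages and the base functions `cut()` and `table()`. It is not the TFSA implementation, and the choice to include the lower endpoint in each interval (`right = FALSE`) is an assumption made for the illustration.

```r
# Hypothetical ages of 8 fish (illustration only)
age <- c(1, 2, 3, 3, 4, 6, 7, 8)

# Endpoints 0, 3, 6, 9 define three class intervals
breaks <- seq(0, 9, by = 3)

# right = FALSE: each interval contains its lower endpoint, i.e. [0,3), [3,6), [6,9)
freq <- table(cut(age, breaks = breaks, right = FALSE))
print(freq)

# Class interval width: difference between consecutive lower endpoints
width <- diff(breaks)[1]
print(width)
```

Here the frequencies are 2, 3 and 3, and the class interval width is 3.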

**Mandatory arguments (inputs):**

`x` - A vector of data.

**Arguments for specifying class intervals (inputs, all three required together):**

`ll` - Lowest endpoint of the class intervals.

`ul` - Uppermost endpoint of the class intervals.

`cl` - Preferred width of the class interval.

The above arguments work only if all three of them are used together. If they are not specified, the function categorizes the data into 10 classes.
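The reason the three arguments go together is that a lowest endpoint, an uppermost endpoint and a width jointly determine every class. A minimal base R sketch of that arithmetic follows, using the values 125, 300 and 7 that appear later in this post. The variable names mirror the argument names, but the code is an illustration and not taken from the TFSA source.

```r
ll <- 125  # lowest endpoint
ul <- 300  # uppermost endpoint
cl <- 7    # class interval width

# Lower endpoints run from ll up to the start of the last class
lower <- seq(ll, ul - cl, by = cl)
upper <- lower + cl
midclass <- (lower + upper) / 2

classes <- data.frame(Lower = lower, Upper = upper, Midclass = midclass)
print(nrow(classes))   # number of classes: (ul - ll) / cl
print(head(classes, 3))
```

With these values there are 25 classes, and the first one runs from 125 to 132 with a midclass of 128.5.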

**Optional arguments (inputs):**

`groups` - A grouping variable (factor or categorical), used when the data belong to more than one group. Default is NA.

`scales` - If TRUE, the bar graph for each group is plotted on its own scale, independent of the sizes of the other groups. Default is FALSE.

`density.plot` - If TRUE, density graphs are plotted instead of bar graphs. Default is FALSE.

**Protocol to use freq.class function**

1. Prepare your data with one variable per column (qualitative or quantitative).

2. Import your data into R using the function `read.table()`. Alternatively, for learning purposes, you can use the ‘fishdata’ data set that comes with the TFSA package. It has 506 observations of fish length.

`> data(fishdata)`

`> fishdata`

3. Find the lowest and highest values of the variable using the functions `min()` and `max()`.

4. Decide on the width of the class interval.

In deciding on the width of the class intervals, you have to find a compromise: the intervals should be narrow enough that the observations do not all fall into the same interval, but wide enough that you do not end up with only one observation per interval.
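One way to explore this compromise is to try several widths and count how many classes result, how many are empty, and how large the fullest class is. The sketch below does this in base R on simulated lengths. The helper `summarise_width()` is a hypothetical name introduced here, not a TFSA function, and the simulated data are not the ‘fishdata’ set.

```r
# Hypothetical helper: summarise the classes produced by a given width
summarise_width <- function(x, width) {
  # Round the endpoints outwards to multiples of the width
  lower <- floor(min(x) / width) * width
  # "+ 1" keeps the maximum strictly below the last break for [a, b) intervals
  upper <- ceiling((max(x) + 1) / width) * width
  breaks <- seq(lower, upper, by = width)
  counts <- table(cut(x, breaks = breaks, right = FALSE))
  c(width = width,
    classes = length(counts),
    empty = sum(counts == 0),
    largest = max(counts))
}

set.seed(42)
len <- round(rnorm(500, mean = 200, sd = 30))  # simulated fish lengths

print(range(len))
print(t(sapply(c(2, 7, 17, 60), summarise_width, x = len)))
```

Very narrow widths tend to give many sparse or empty classes, while very wide ones lump most observations into a few classes. A width between those extremes is usually the easiest to read.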

Usage:

1. To get a quick result, use freq.class() with the data vector. This will categorize data into 10 classes.

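`> freq.class(fishdata)`

Frequency Distribution Table

|  | Lower | Upper | Midclass | Frequency |
|---|---|---|---|---|
| 1 | 127 | 144 | 135.5 | 1 |
| 2 | 144 | 161 | 152.5 | 18 |
| 3 | 161 | 178 | 169.5 | 104 |
| 4 | 178 | 195 | 186.5 | 137 |
| 5 | 195 | 212 | 203.5 | 63 |
| 6 | 212 | 229 | 220.5 | 68 |
| 7 | 229 | 246 | 237.5 | 57 |
| 8 | 246 | 263 | 254.5 | 45 |
| 9 | 263 | 280 | 271.5 | 8 |
| 10 | 280 | 297 | 288.5 | 4 |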
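The table above can be re-derived by hand, which helps when reading it: it shows ten classes of width 17, starting at 127. The sketch below only reconstructs the printed numbers. How freq.class chooses the default width internally is not stated in this post, so the starting point and width here are read off the output rather than taken from the package.

```r
# Values read off the printed table: 10 classes of width 17 starting at 127
lower <- seq(127, by = 17, length.out = 10)
upper <- lower + 17
midclass <- (lower + upper) / 2

# Frequencies copied from the printed table
freq <- c(1, 18, 104, 137, 63, 68, 57, 45, 8, 4)

default_tab <- data.frame(Lower = lower, Upper = upper,
                          Midclass = midclass, Frequency = freq)
print(default_tab)

# Total number of observations covered by the ten classes
print(sum(freq))
```

The printed frequencies sum to 505, one fewer than the 506 observations in ‘fishdata’, and the last upper endpoint is 297. It is therefore worth comparing the last endpoint with `max()` of your own data when relying on the default classes.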

2. To customize the class intervals, use the arguments ‘ll=’, ‘ul=’ and ‘cl=’. The fish data range from 127 to 299, so let’s set the lowest endpoint to 125 and the highest endpoint to 300, with a class interval width of 7.

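`> freq.class(fishdata,ll=125,ul=300,cl=7)`

Frequency Distribution Table

|  | Lower | Upper | Midclass | Frequency |
|---|---|---|---|---|
| 1 | 125 | 132 | 128.5 | 1 |
| 2 | 132 | 139 | 135.5 | 0 |
| 3 | 139 | 146 | 142.5 | 0 |
| 4 | 146 | 153 | 149.5 | 5 |
| 5 | 153 | 160 | 156.5 | 9 |
| 6 | 160 | 167 | 163.5 | 23 |
| 7 | 167 | 174 | 170.5 | 48 |
| 8 | 174 | 181 | 177.5 | 61 |
| 9 | 181 | 188 | 184.5 | 76 |
| 10 | 188 | 195 | 191.5 | 37 |
| 11 | 195 | 202 | 198.5 | 22 |
| 12 | 202 | 209 | 205.5 | 30 |
| 13 | 209 | 216 | 212.5 | 25 |
| 14 | 216 | 223 | 219.5 | 31 |
| 15 | 223 | 230 | 226.5 | 27 |
| 16 | 230 | 237 | 233.5 | 22 |
| 17 | 237 | 244 | 240.5 | 25 |
| 18 | 244 | 251 | 247.5 | 26 |
| 19 | 251 | 258 | 254.5 | 21 |
| 20 | 258 | 265 | 261.5 | 8 |
| 21 | 265 | 272 | 268.5 | 2 |
| 22 | 272 | 279 | 275.5 | 2 |
| 23 | 279 | 286 | 282.5 | 3 |
| 24 | 286 | 293 | 289.5 | 1 |
| 25 | 293 | 300 | 296.5 | 1 |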

3. If the data belong to different groups, this can be specified with the argument ‘groups=’.

The TFSA package comes with a second example data set called ‘fishgrp’. It contains fish lengths measured at different commercial landing centres in India. The data have two columns: one is a factor (landing centre) and the other is a quantitative variable (fish length).

If you are using your own data, make sure it is formatted in the same way, with one column for the grouping factor and one for the measurement.

`> data(fishgrp)`

`> str(fishgrp)`

`'data.frame': 506 obs. of 2 variables:`

`$ Stock: Factor w/ 4 levels "Calcutta","Gujarat",..: 3 3 3 3 3 3 3 3 3 3 ...`

`$ SL : int 228 264 244 209 220 202 242 256 249 170 ...`

`> freq.class(fishgrp$SL,groups=fishgrp$Stock)`

Frequency Distribution Table

|  | Lower | Upper | Midclass | Calcutta | Gujarat | Mumbai | Orissa |
|---|---|---|---|---|---|---|---|
| 1 | 127 | 144 | 135.5 | 0 | 0 | 0 | 1 |
| 2 | 144 | 161 | 152.5 | 1 | 4 | 1 | 12 |
| 3 | 161 | 178 | 169.5 | 36 | 25 | 1 | 42 |
| 4 | 178 | 195 | 186.5 | 81 | 33 | 3 | 20 |
| 5 | 195 | 212 | 203.5 | 15 | 21 | 14 | 13 |
| 6 | 212 | 229 | 220.5 | 0 | 11 | 33 | 24 |
| 7 | 229 | 246 | 237.5 | 0 | 7 | 32 | 18 |
| 8 | 246 | 263 | 254.5 | 0 | 11 | 25 | 9 |
| 9 | 263 | 280 | 271.5 | 0 | 1 | 5 | 2 |
| 10 | 280 | 297 | 288.5 | 0 | 0 | 4 | 0 |
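For your own data, the two-column layout of ‘fishgrp’ can be reproduced with `read.table()`. The sketch below reads a few made-up rows from inline text (in practice you would pass a file name) and cross-tabulates class intervals by group with base R. The rows, the breaks and the use of `cut()` and `table()` are illustrative assumptions and not the TFSA implementation.

```r
# Hypothetical rows in the same layout as fishgrp: a factor and a length
own <- read.table(text = "
Stock SL
Mumbai 228
Mumbai 264
Gujarat 170
Gujarat 181
Orissa 150
Orissa 203
", header = TRUE, stringsAsFactors = TRUE)

str(own)  # Stock should be a factor, SL an integer

# Class intervals of width 35 between 125 and 300
breaks <- seq(125, 300, by = 35)

# Rows: class intervals, columns: groups
tab <- table(cut(own$SL, breaks = breaks, right = FALSE), own$Stock)
print(tab)
```

If `str()` shows the grouping column as character rather than factor, converting it with `factor()` before passing it on keeps the groups in a predictable order.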

4. In the graph produced by the above command, Calcutta has a frequency of 81 in one class (178–195). Because of this, the shape of the frequency distributions for the other locations is not clear. The argument ‘scales=TRUE’ makes the graphs independent of each other, which gives a clearer picture.

`> freq.class(fishgrp$SL,groups=fishgrp$Stock,scales=TRUE)`
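The effect of independent scales can be reproduced directly with the lattice package, which is loaded at the start of this post. The sketch below uses simulated data with one large and one small group. Whether freq.class sets lattice's `relation = "free"` internally is an assumption, so treat this as an illustration of the idea rather than of the package.

```r
library(lattice)

set.seed(1)
# Hypothetical lengths: a large group "A" and a much smaller group "B"
toy <- data.frame(
  Stock = factor(rep(c("A", "B"), times = c(200, 20))),
  SL = round(c(rnorm(200, mean = 185, sd = 8),
               rnorm(20, mean = 230, sd = 15)))
)

# Shared y axis: the small group is dwarfed by the large one
p_same <- histogram(~ SL | Stock, data = toy, type = "count")

# Free y axis per panel: each group is drawn on its own scale
p_free <- histogram(~ SL | Stock, data = toy, type = "count",
                    scales = list(y = list(relation = "free")))

# Draw the plot when working at the console
if (interactive()) print(p_free)
```

Comparing `p_same` and `p_free` side by side shows why a single tall bar in one group can hide the shape of the others.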

5. If you would prefer a density plot instead of a bar plot, use the argument ‘density.plot=TRUE’.

`> freq.class(fishgrp$SL,groups=fishgrp$Stock,density.plot=TRUE)`
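A comparable per-group density plot can be drawn with lattice's `densityplot()`. The sketch below again uses simulated data. It shows the kind of plot this option refers to and assumes nothing about how freq.class produces it.

```r
library(lattice)

set.seed(2)
# Hypothetical lengths for two groups
toy <- data.frame(
  Stock = factor(rep(c("A", "B"), each = 100)),
  SL = round(c(rnorm(100, mean = 185, sd = 10),
               rnorm(100, mean = 230, sd = 15)))
)

# One smoothed density curve per group; plot.points = FALSE hides the raw points
p_dens <- densityplot(~ SL | Stock, data = toy, plot.points = FALSE)

# Draw the plot when working at the console
if (interactive()) print(p_dens)
```

A density curve does not depend on the choice of class interval width, which makes it a useful cross-check on the bar plots.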

Please leave a comment or email if you find a bug. Thanks for reading 🙂

Posted on June 24, 2012, in TFSA and tagged fish stock assessment, fisheries management.
