Statistics 506, Fall 2016

Project


The project is due on December 19th at noon (to be submitted on Canvas).

The source for the data is here. You can use any of the “Vessel Tracking Data” sets on this page for your project.

In the project, you should formulate three clearly defined research questions relating to these data sets. Then answer each question yourself, using the data to support your arguments.

The questions that you formulate should be limited in scope, similar to the homework questions we have had this term. The statement of each question, together with your solution to it, should require no more than one page, not including any code. So your entire project (again, not including code) should be limited to three pages. In addition to these three pages of text, you should provide clean, well-documented code. There is no strict limit on the number of pages of code, but you should try to make your code as concise as reasonably possible. Around 5-6 pages should be sufficient.

Unlike the homework sets, the projects are to be done independently. You should not discuss them outside of class with anyone. However, we will reserve some class time to discuss the projects together. You may use any resources you like (books, internet, etc.). If you rely heavily on any one source, you should cite it.

You can use any of the computing languages we have studied this term (Stata, R, SAS). You may also use libraries for these languages as needed, including libraries that we did not discuss in class.

Note that the zip archives that you obtain from the web site given above contain many files, but I suggest that you use only the dbf data file and the documentation (pdf) file. The other files are mainly GIS files, which we have not studied this term. The dbf file can be opened in Stata, R, and SAS.
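
As one possible way to open a dbf file in R, the sketch below uses read.dbf from the foreign package (assumed to be installed; it ships with most R distributions). The file toy.dbf and its column names are made up for illustration: the code writes a tiny table to a temporary dbf so that it can run without any download, and you would replace the path with the location of the dbf file from your own zip archive.

```r
library(foreign)  # provides read.dbf() and write.dbf()

# Stand-in for a downloaded file: write a tiny table to a temporary dbf.
# The data and column names are hypothetical, for illustration only.
toy <- data.frame(id = c(1L, 2L, 3L), speed = c(0.0, 7.5, 12.1))
path <- file.path(tempdir(), "toy.dbf")
write.dbf(toy, path)

# Read the dbf into a data frame; as.is = TRUE keeps character
# columns as strings rather than converting them to factors.
dat <- read.dbf(path, as.is = TRUE)

# Inspect the column types and the first few rows.
str(dat)
head(dat)
```

Stata and SAS have their own import facilities for dbf files; consult their documentation for the exact syntax.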