239 changes: 239 additions & 0 deletions 21.html

Large diffs are not rendered by default.

1,974 changes: 1,974 additions & 0 deletions Assignment4_Seq_Homo_Search/seq_search_handout.ipynb

Large diffs are not rendered by default.

99 changes: 99 additions & 0 deletions FinalProblemSet/.ipynb_checkpoints/Problem2-checkpoint.ipynb
@@ -0,0 +1,99 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2021-11-29 22:29:01-- https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants/humsavar.txt\n",
"Resolving ftp.uniprot.org (ftp.uniprot.org)... 128.175.240.195\n",
"Connecting to ftp.uniprot.org (ftp.uniprot.org)|128.175.240.195|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 7857000 (7.5M) [text/plain]\n",
"Saving to: ‘humsavar.txt.1’\n",
"\n",
"100%[======================================>] 7,857,000 26.3MB/s in 0.3s \n",
"\n",
"2021-11-29 22:29:01 (26.3 MB/s) - ‘humsavar.txt.1’ saved [7857000/7857000]\n",
"\n"
]
}
],
"source": [
"!wget https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants/humsavar.txt"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
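"import csv\n",
"\n",
"txt_file = \"humsavar.txt\"\n",
"csv_file = \"humsavar.csv\"\n",
"\n",
"# Context managers close both files; newline='' is what the csv module expects.\n",
"with open(txt_file, \"r\", newline='') as src, open(csv_file, \"w\", newline='') as dst:\n",
"    # Split each input line on tabs and write it back out comma-separated.\n",
"    csv.writer(dst).writerows(csv.reader(src, delimiter='\\t'))"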
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"List of column names: [{'--------------------------------------------------------------------------------': ' UniProt - Swiss-Prot Protein Knowledgebase'}]\n"
]
}
],
"source": [
"with open('humsavar.csv', 'r', newline='', encoding='utf8') as humsavar:\n",
"    read_humsavar = csv.DictReader(humsavar)  # humsavar.csv was written comma-separated\n",
"    list_of_column_names = []\n",
"    for row in read_humsavar:\n",
"        list_of_column_names.append(row)  # keep only the first row after the header line\n",
"        break\n",
"    print(\"List of column names: \", list_of_column_names)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
113,103 changes: 113,103 additions & 0 deletions FinalProblemSet/9606.tsv

Large diffs are not rendered by default.

Binary file added FinalProblemSet/9606.tsv.gz
Binary file not shown.
19 changes: 19 additions & 0 deletions FinalProblemSet/Problem1.R
@@ -0,0 +1,19 @@
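# Usage: Rscript Problem1.R <table_file> <accession> <position>
args <- commandArgs(trailingOnly = TRUE)

filename <- args[1]          # input table
acc <- args[2]               # accession to look up (matched against column 1)
loc <- as.numeric(args[3])   # position to test against columns 4 and 5

# Skip the 3 leading lines; keep text columns as character so cat() prints
# the values rather than factor codes on R < 4.0.
pfam <- read.table(filename, skip = 3, header = FALSE, stringsAsFactors = FALSE)

# Rows for this accession whose V4..V5 range contains the position.
hit <- (pfam$V1 == acc) & (loc >= pfam$V4) & (loc <= pfam$V5)

# Print column 7 of every matching row, space-separated.
cat(pfam[hit, 7], "\n")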
99 changes: 99 additions & 0 deletions FinalProblemSet/Problem2.ipynb
@@ -0,0 +1,99 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2021-11-29 22:29:01-- https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants/humsavar.txt\n",
"Resolving ftp.uniprot.org (ftp.uniprot.org)... 128.175.240.195\n",
"Connecting to ftp.uniprot.org (ftp.uniprot.org)|128.175.240.195|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 7857000 (7.5M) [text/plain]\n",
"Saving to: ‘humsavar.txt.1’\n",
"\n",
"100%[======================================>] 7,857,000 26.3MB/s in 0.3s \n",
"\n",
"2021-11-29 22:29:01 (26.3 MB/s) - ‘humsavar.txt.1’ saved [7857000/7857000]\n",
"\n"
]
}
],
"source": [
"!wget https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants/humsavar.txt"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
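"import csv\n",
"\n",
"txt_file = \"humsavar.txt\"\n",
"csv_file = \"humsavar.csv\"\n",
"\n",
"# Context managers close both files; newline='' is what the csv module expects.\n",
"with open(txt_file, \"r\", newline='') as src, open(csv_file, \"w\", newline='') as dst:\n",
"    # Split each input line on tabs and write it back out comma-separated.\n",
"    csv.writer(dst).writerows(csv.reader(src, delimiter='\\t'))"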
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"List of column names: [{'--------------------------------------------------------------------------------': ' UniProt - Swiss-Prot Protein Knowledgebase'}]\n"
]
}
],
"source": [
"with open('humsavar.csv', 'r', newline='', encoding='utf8') as humsavar:\n",
"    read_humsavar = csv.DictReader(humsavar)  # humsavar.csv was written comma-separated\n",
"    list_of_column_names = []\n",
"    for row in read_humsavar:\n",
"        list_of_column_names.append(row)  # keep only the first row after the header line\n",
"        break\n",
"    print(\"List of column names: \", list_of_column_names)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
30 changes: 30 additions & 0 deletions FinalProblemSet/Problem3.Rmd
@@ -0,0 +1,30 @@
---
title: "Problem 3 of Final Set"
output: html_notebook
editor_options:
chunk_output_type: inline
---

This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Cmd+Shift+Enter*.

```{r}
plot(cars)
```

Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing *Cmd+Option+I*.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press *Cmd+Shift+K* to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike *Knit*, *Preview* does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.

```{r}
hum <- read.csv("humsavar.csv")

# plot(hum$)  # incomplete: no column is named after `$`, so this line does not parse

```



309 changes: 309 additions & 0 deletions FinalProblemSet/Problem3.nb.html

Large diffs are not rendered by default.

39 changes: 39 additions & 0 deletions FinalProblemSet/Problem4.Rmd
@@ -0,0 +1,39 @@
---
title: "Problem 4 of Final Set"
output: html_notebook
editor_options:
chunk_output_type: inline
---

This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Cmd+Shift+Enter*.

```{r}
plot(cars)
```

Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing *Cmd+Option+I*.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press *Cmd+Shift+K* to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike *Knit*, *Preview* does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.

```{r}
r <- read.table(
  "ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants/humsavar.txt",
  header = FALSE,
  skip = 49, sep = "", fill = TRUE,
  stringsAsFactors = FALSE, flush = TRUE,
  nrows = 78710)
r <- r[, -ncol(r)]  # drop the last column

abundance <- read.table(  # note: this reads the same humsavar.txt URL as `r` above
  "ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants/humsavar.txt",
  header = FALSE,
  skip = 49, sep = "", fill = TRUE,
  stringsAsFactors = FALSE, flush = TRUE,
  nrows = 78710)
```


299 changes: 299 additions & 0 deletions FinalProblemSet/Problem4.nb.html

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions FinalProblemSet/Problem5.Rmd
@@ -0,0 +1,27 @@
---
title: "Problem 5 of Final Set"
output: html_notebook
editor_options:
chunk_output_type: inline
---

This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Cmd+Shift+Enter*.


Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing *Cmd+Option+I*.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press *Cmd+Shift+K* to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike *Knit*, *Preview* does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.


Fermi Estimation

How many pizza restaurants are in New York City?


To solve this problem, I would first determine how many restaurants there are in New York, and then, based on how popular New York pizza is, determine how many of those restaurants may be pizza places.

There are probably approximately 30,000 restaurants in all of New York City. Pizza in the city is largely a grab-and-go food, so there may be many smaller pizza places on street corners and the like. Most large cities also have a wildly diverse selection of restaurants, so I would assume that there are several highly competitive varieties of restaurant within the city. To divide the restaurants into types (Chinese, Indian, Italian, sandwich shops, burger joints, and so on), I would divide the total number of restaurants by 10, which gives roughly 3,000 restaurants of each type. Italian restaurants may also serve pizza, so the next step is to single out the restaurants that are strictly pizza places. Because 3,000 would still be a high number, and assuming that a few of the pizza places are franchises, I would bring that down to 700 true pizza places in the New York City area. Finally, to account for the franchising of some places, I would say there may be around 400 pizza places in all of New York City.
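
The arithmetic in the estimate above can be sketched in Python. This is only an illustration and not part of the original notebook: the function name `fermi_pizza_estimate` and its parameter names are hypothetical, and the 700 and 400 figures are entered directly as the judgment calls made in the text rather than derived from any data.

```python
# Fermi estimate of pizza restaurants in New York City, using the figures from the text.
# All names here are illustrative assumptions, not taken from the original file.

def fermi_pizza_estimate(total_restaurants=30_000, cuisine_types=10,
                         strictly_pizza=700, after_franchising=400):
    """Return each step of the estimate as (label, value) pairs."""
    per_type = total_restaurants // cuisine_types  # 30,000 / 10 = 3,000
    return [
        ("all restaurants", total_restaurants),
        ("restaurants per cuisine type", per_type),
        ("strictly pizza places (judgment call)", strictly_pizza),
        ("after accounting for franchising (judgment call)", after_franchising),
    ]

steps = fermi_pizza_estimate()
for label, value in steps:
    print(f"{label:<50}{value:>8,}")

# Shrink factors implied by the two judgment-call steps.
per_type = steps[1][1]
pizza_share = steps[2][1] / per_type
franchise_share = steps[3][1] / steps[2][1]
print(f"strictly pizza / per type: {pizza_share:.2f}")
print(f"after franchising / strictly pizza: {franchise_share:.2f}")
```

Writing the steps out this way makes the two subjective reductions explicit: only the division by 10 is computed, while the later figures are assumptions that a reader could replace with their own.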
292 changes: 292 additions & 0 deletions FinalProblemSet/Problem5.nb.html

Large diffs are not rendered by default.
