7 changes: 7 additions & 0 deletions 2013-09-07.md
@@ -0,0 +1,7 @@
This week, I was introduced to a lot of new things that I had never encountered before. Being in the "R-dominated" world of statistics for too long has left me knowing nothing but R. It is kind of sad to realize that in my last year of college. The desire to learn about other technologies has driven me to stay in this class, which is full of unfamiliar things. I ran into many "technical difficulties" this very first week. Some of them were as small as not knowing how to fork an updated repository or how to register a username on IRC. These small questions were quickly solved during the group time on Tuesday. In fact, some of my groupmates had the same confusion as I did. I think those problems occur because we are still new to these settings; once we try a few more times, we will get familiar and become proficient. However, there are some bigger questions. For me, the most critical challenges are all about the virtual machine.

I met the first obstacle when I had to enable hardware-level virtualization in the BIOS. It sounds silly, but I turned my computer on and off a couple of times, trying the F-keys to get into the BIOS, and I ended up going online and googling the key to access the BIOS on Sony computers. I will remember from now on that F2 gets VAIOs to the BIOS settings.

Then my next obstacle came up when I failed to get the virtual machine to do anything. I first brought this up in my group during class, and we found out that something must be wrong because I had never been asked to set up a login name and password during the whole process of setting up the machine. Then I took this question to Aaron, and I was told that my virtual machine was not fully installed. Besides emailing Aaron for an appointment to get the installation done, I also tried many other ways to figure out what I had missed. I googled many VirtualBox installation walkthroughs as well as FAQs. At the same time, I also went back and retraced the procedure posted in homework-01.md. But I still had no clue whether I should uninstall everything and start over, since it was not fully installed. What I googled did not help that much. Thinking that uninstalling and reinstalling might take too much work and might not be necessary, I thought that removing the old VM and recreating it might be worth trying to reduce the pool of possible mistakes. So I tried and succeeded! I had my "Aha!" moment when I figured out that I had not chosen to install the server when I was setting up the machine... Even after I realized this and followed the installation steps, the installation failed to install all the software when I was asked to choose what software I wanted to install besides the core of the system; I went back and unchecked "manual package selection", and then it installed completely. Everything looks "normal" now--same as everybody else in my group. I also tried to run the command we used in class on Thursday for ipython notebooks, and the VM responds normally. So I assume I have made it work! Of course, I will still check with others or the instructors that I have actually succeeded.

Looking back at this week, I did a lot of trying and testing--trying different approaches and testing whether they work. I have noticed that sometimes it is a good thing to pause and step back for a while, think about the problem from another starting point, and then try a different approach instead of trying the same thing over and over again and getting the same error.
30 changes: 30 additions & 0 deletions 2013-09-14.md
@@ -0,0 +1,30 @@
reflections-1
=============
2013-09-14

This week, we have been working on installing vagrant. Actually, I am not quite sure what we are trying to get
'vagrant up' to work for. I have a vague feeling that Aaron wants us to do something with the ipython notebook, with
'vagrant' as the command processor. To be honest, I have had this vague feeling ever since we first
touched on the virtual machine. Because I am not a CS major, and I think many of us aren't either, I don't have enough
basic knowledge about computers. So sometimes I was not able to follow the ideas fluently, such as how to use ** software
to help us build a *** environment which allows group contributions towards the same project. I know this sentence sounds
like nonsense, but that is the best I could catch. Given my difficulty in catching the objectives, I hope we can
have more clearly explained objectives so that I can have a basic idea of what we are aiming for and how a specific
step will help us achieve the objective. I also think it would be helpful if we could have clearer instructions
for each step. If so, we would be clearer about whether we have done the steps correctly, and even if we haven't, we could still
follow the later steps as soon as we get the previous roadblocks solved.

Something that I like about this week is the way that we have our groups set up. Within each group, we have a
technical lead and an operational lead. I think it is pretty efficient that we can have some small problems solved within
our group first before asking for the GSIs' help. And we actually have the opportunity to share our progress and roadblocks
with our groupmates. That way, we don't feel lost or too far behind anymore. Breaking into groups makes the goal
of keeping everybody on the same page more achievable. On Thursday, we discussed many issues that we think should be
improved. We all think we need more communication outside of class, especially since IRC does not really help when we are
all offline. We also share the feeling that we should have a clearer agenda for everything.

Regarding more out-of-class communication, our group decided to try a Facebook group. We already have our Facebook
group active and have started to keep track of each other. As the operational lead of the group, I asked our group to briefly
report how far they had gotten successfully. We are at slightly different stages around getting 'vagrant up' to work and further
attempts to access ipython in the browser. I think I have 'vagrant up' running and virtualbox ready (according to GSI Chris).
But I don't know what further steps I should be working on... So I am still waiting for further instructions, and so is my
group.
12 changes: 12 additions & 0 deletions 2013-09-21.md
@@ -0,0 +1,12 @@
2013-09-21--Siyang Zeng
=============

This week was mainly about research experiences and how statistics plays a role in analyzing research results, as shared by the two guest lecturers. Aside from that, we were asked to keep ourselves on track--tackle vagrant-related problems and get ipython notebook opened successfully in the web browser through the virtual box.

Windows system:
I originally had Windows on my computer. I followed everything for running vagrant; however, I had a problem running 'vagrant ssh', as it returned an error message saying that I did not have SSH installed on my computer. I then spent a lot of time figuring out which ssh client to install and how to install and configure it, before the GSI announced that all Windows users would run vagrant on a Linux system instead. I think there is a gap between what the professor expected us to already know and what background knowledge about computers stat students actually have. We are expected to have ssh installed on our computers (or ssh may be installed automatically on certain systems), but we do not even know what ssh is. Before I got the Linux system installed, I had succeeded in accessing ipython notebook from a command terminal, which directed me to the web browser, but not through the virtual box.

Linux system:
The GSIs scheduled a session to help us install the Linux system and get everything on track. I think that session was really helpful. During that session, we could pair up or group with people who had the same system and were at a similar stage of installing. That way, we could discuss and help each other with problems that one of us had already faced and solved with the GSI. I really liked the way Chris summarized all the steps from starting vagrant to accessing ipython notebook from the web browser. The list of steps not only helped us keep track of which steps we had done and which we had not, but also helped us reproduce and perform the procedures on our own instead of just following along without knowing what the steps are for. I have successfully completed all the steps and am able to do it on my own. I think my next thing to do is to get familiar with the Linux system.

Another thing I am concerned about is the correct way of 'submitting' our reflections. What I, and my group, have done is create a new repository, "Reflections", and post a new file for each week's reflection. However, I have heard that we should fork the professor's 'reflection' repository, add our reflections to it, and then send a pull request to the professor. I did not know this until Tuesday. And I saw that only a few of our classmates actually sent pull requests for their reflections. The GSI said she will ask the professor again for detailed instructions about submitting reflections. But I think I should address this with my group and suggest that my groupmates fork and send pull requests for now while waiting for further instructions.
12 changes: 12 additions & 0 deletions 2013-09-28.md
@@ -0,0 +1,12 @@
2013-09-28(Siyang Zeng)
=============

About reproducibility:
This week we talked about reproducibility. Through the lectures, ideas, and experiences shared by Chris and Kristina, I started to get some sense of the point of reproducibility. To be honest, I had never heard of or thought about reproducibility before. Because this topic does not seem popular or widely influential, I had never really thought about the meaning and importance behind it.

However, after this week, I have many thoughts about it. In my opinion, the point of emphasizing reproducibility is the accuracy and reliability of the data used to support any conclusions. This idea can be put in statistical terms: we usually model simulations and experiments with independent subjects and run various tests to try to find out whether a specific phenomenon appeared by chance or reflects a real effect.

So being able to reproduce the whole modelling process and get the same result, which can be thought of as a phenomenon happening again when an experiment is repeated under the same conditions, is necessary to verify that a result is reliable. This is how I understand reproducibility. But it is the only reason I see so far for why reproducibility is important in data science.

About the course:
Now that we have gone through all the preparation procedures (installing everything), I am much clearer about where we are heading. Especially after Aaron explained why the virtual machine is necessary for our course, I finally see the point of installing a Linux system just to get the virtual machine to work well. And the way Chris illustrated the relations between the different stages of a research study (observations-->data-->model) also helped us see what our roadmap should be.
21 changes: 21 additions & 0 deletions 2013-10-05.md
@@ -0,0 +1,21 @@
Reflection of the week 2013.09.30-2013.10.04
-------------------------
On Tuesday, we were categorized according to our learning styles as data curators, data analyzers, visualizers, and presenters, and we were assigned to groups of 4 to finish a project. The project explanation is uploaded on GitHub.

On Thursday, Prof. Stark continued his presentation about earthquake predictions.
Notes are on: https://github.com/SunnySunnia/Group7/blob/master/2013-10-03.md

Some concerns and feedback on this week:
I think the way the project was just uploaded on GitHub without a clear explanation given directly to us left us confused about many things:
For example, what are we supposed to do with the example.cfg file? Where should we enter our real bConnected key, and where should we not?
How do we import gspread into ipython notebook?
We did not even know who would be working with us in a group until the end of Thursday.
What is a GitHub push?
etc.

Fortunately, we can post issues on the questionnaire repository. That is helpful because we can see what problems other classmates are facing and what the solutions to those problems are.
But still, I think it would be more efficient if we could first meet our group in class and reach some consensus on the plan for everything (deadlines for each role, when to meet with instructors, when to meet for a final check of everything, etc.).

Furthermore, many of us are new to Python. It takes us a really long time to figure out how to graph in the ipython notebook by ourselves. I think it would be much easier if we could have a tutorial session together.

At this moment, our group's data is not yet ready for further analysis. And I, the visualizer, have just figured out how to plot functions, but still need to learn how to plot histograms and other statistical plots.
22 changes: 22 additions & 0 deletions 2013-10-12.md
@@ -0,0 +1,22 @@
Reflection on the project:
-------------------------------------------------------
* I am the visualizer of Group1.
- On our way to finishing the project, we met many obstacles. The most influential ones were all technical: having problems saving the .cfg file and smoothly accessing the spreadsheet in ipynb, and having no experience with data processing and graphing in ipynb.
- We spent too long on just data curation; we got our final dataset on Saturday night.
- We set making our project reproducible as our goal from the very beginning, so we decided to try everything in ipynb. I am not sure exactly what my other groupmates went through, but I spent a long, hard time figuring out how to graph in ipynb from scratch. Luckily I figured out at least enough for us to present our findings.
- There was a small problem: one of our groupmates was out of town and basically too busy with other scheduled goals, so we did not have a chance to meet with everybody to discuss what each of us was expected to complete. But luckily, again, we had pretty efficient conversations via Facebook and Google Hangouts.
- We have a Facebook group within the horizontal visualizers' group, which I think is a good way to communicate; however, since each of us is in a different vertical group with a different pace and a different approach to visualizing, we can in fact only discuss things as general as what kinds of plots are being made for the vertical groups. Also, due to lack of time, I became too busy to check how everybody else in our horizontal group was doing as the presentation date approached.

+ (within vertical group)
-------
- efficient communication.
- understood the role of each member.
- though there was a lot left for us to figure out from scratch by ourselves, we did not give up on searching and trying.
- We met with Aaron and the GSIs; meeting and talking to them made us much clearer about everything. We got many technical problems solved and became less panicked than before.


delta
-----------------
- should start earlier.
- should spend some time explaining to groupmates how each part was completed. For instance, I should explain to my groupmates how my code contributes to the plots so that I would not be the only one who can make modifications to the graphs; the same goes for the data curator and analyzer. One of our goals should also be learning, so we should all know what to do with each part.
- should plan ahead what the expectations are for each step: what final format of the data the analyzer and visualizer want from the curator, what kinds of plots the analyzer wants to illustrate his findings with, how much detail all previous parts should be explained in so that the presenter knows which step to emphasize, etc.
14 changes: 14 additions & 0 deletions 2013-10-19.md
@@ -0,0 +1,14 @@
Reflection 2013-10-19
----

This week we were assigned a new homework focusing on parsing JSON data and parametrizing visualization functions.
I think the focus of the week is how to make sure people in the future can access the data that you used to draw your conclusions, no matter how many years later.

Regarding this homework, my role is to modify the code so that we can visualize the earthquakes of whichever state we want to focus on. I feel more comfortable with this project, maybe because we are clearer about everything.
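
To make the idea of "parsing JSON and parametrizing by state" concrete, here is a minimal sketch, not the actual homework code. It assumes the earthquake data is in a GeoJSON-like layout (a `features` list with `geometry.coordinates` as `[lon, lat, depth]` and a `properties.mag` field); the sample records, the `STATE_BOXES` table, and the function names are all hypothetical.

```python
import json

# Hypothetical sample in a GeoJSON-like layout; the real data may differ.
SAMPLE = """
{"features": [
  {"properties": {"mag": 4.1, "place": "near Example City"},
   "geometry": {"coordinates": [-122.0, 37.5, 8.0]}},
  {"properties": {"mag": 5.3, "place": "offshore example"},
   "geometry": {"coordinates": [-100.0, 45.0, 10.0]}}
]}
"""

# Hypothetical corner coordinates (west, east, south, north) in degrees.
STATE_BOXES = {
    "california": (-125.0, -114.0, 32.0, 42.5),
}


def load_quakes(json_text):
    """Return a list of (lon, lat, mag) tuples from GeoJSON-like text."""
    data = json.loads(json_text)
    quakes = []
    for feature in data["features"]:
        lon, lat = feature["geometry"]["coordinates"][:2]
        quakes.append((lon, lat, feature["properties"]["mag"]))
    return quakes


def quakes_in_state(quakes, state):
    """Keep only the quakes that fall inside the named state's box."""
    west, east, south, north = STATE_BOXES[state]
    return [q for q in quakes
            if west <= q[0] <= east and south <= q[1] <= north]


print(quakes_in_state(load_quakes(SAMPLE), "california"))
```

Passing the state name as a parameter means the same plotting code can be reused for any state that has an entry in the table. The plain `west <= lon <= east` comparison only works when the box does not wrap around the 180° meridian.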

Something that I am still working on is setting up the map window for each state.

I have an "aha" moment whenever I figure out something new about plotting in ipynb.
--One specific one was when I figured out a way to combine everything into a function (including setting the corner coordinates) that works for every state but Alaska! And I found out with Aaron that the reason is that Alaska crosses both the eastern and western hemispheres: its longitude runs from 130°W to 172°E.

So now I have to figure out how to solve this bug.
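
One possible way to handle this, sketched here under assumptions and not as the fix that was actually used: treat a longitude range whose west edge is numerically larger than its east edge as wrapping across the 180° meridian. The function name `lon_in_range` is hypothetical; longitudes are in degrees, with west negative and east positive.

```python
def lon_in_range(lon, west, east):
    """Check whether lon lies between the west and east edges.

    If west > east, the range is assumed to wrap across the 180-degree
    meridian, so a point counts as inside when it is east of the west
    edge OR west of the east edge.
    """
    if west <= east:
        # Ordinary case: the box stays within one side of the meridian.
        return west <= lon <= east
    # Wrapping case: e.g. from 172 E (172) through 180 to 130 W (-130).
    return lon >= west or lon <= east


# Alaska's range from the text: 172 E on the west side, 130 W on the east.
ALASKA_WEST, ALASKA_EAST = 172.0, -130.0

print(lon_in_range(-150.0, ALASKA_WEST, ALASKA_EAST))  # a point at 150 W
print(lon_in_range(-100.0, ALASKA_WEST, ALASKA_EAST))  # a point at 100 W
```

An alternative with the same effect would be to shift all longitudes into the 0 to 360 range before comparing, which may fit better if the map window itself also needs to be drawn across the meridian.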
26 changes: 26 additions & 0 deletions 2013-10-26.md
@@ -0,0 +1,26 @@
Reflection
---

This week, we discussed a rough roadmap for this class for the rest of the semester.
For the rest of the semester, we have at least 3 things to do:
* Program (reproducible)
---data: where from? what format? what input to model? what output?
---model: ETAS, 'simple/starks', Poisson
---Github, AWS

* Paper
---abstract-summary,ETAS
---intro
---methods-how did you do it? steps
---results (negative result?)
---citation-format? how many?

* Public understanding of science
---for people who don't want to read the paper in detail
---update wikipedia?
---short presentation?

I think having some time each week to talk about future goals and agendas is a good way to keep things on track. Doing so keeps us from feeling lost about why we did something or what to do next.

However, we still need more ideas about how to divide groups and tasks. I think one thing that helps us figure those out is the issue tracker in the new repo. It collects all of our questions about everything, though most of the questions posted are still waiting for answers from Aaron. As we will be split into groups depending not only on roles but also on tasks, I think we need more instructions or explanations on how and when to split. For example, do we discuss and come up with different tasks in class and then sign up for the tasks? Or could we form our own groups and have Aaron assign us different tasks? Or a mix of the two?

17 changes: 17 additions & 0 deletions 2013-11-02.md
@@ -0,0 +1,17 @@
I think we spent all of the lecture time this week discussing and giving feedback.

As a visualizer, my goals are:
* Be clear about each step of this project, i.e., what each horizontal group's role is.
* Be familiar with plotting tools.
* Have rough ideas about what plots we will need.
* Figure out what we need from analyzers and/or data curators.

Some questions to consider:
* What format do I want the data in?
* How should I start working if I haven't received the data yet?
* How has the data been visualized in Luen's paper?
* What modifications should be done on top of Luen's visualization?
* Would visualizations be different between the paper vs. presentation?

What I am working on now:
* Use the earthquake data from the last homework and plot some different graphs to see if I can find patterns in the occurrences of the earthquakes.