Thursday, January 31, 2013

Re-reading the original paper.

Currently still reading Computational Methods in Acoustics. Printed the second half onto paper yesterday. I think this is something that I need to draw on while I read... It's like a math text book, and I feel that I probably only need to read it for the big picture and to know where to refer back if I ever need it later.

Re-reading the SPREAD paper after skimming some of the other readings made the paper more understandable. This time, I actually have a vague idea of what HAC is, and what other keywords in the paper are. However, understanding the paper a little more also makes my project proposal a lot more intimidating. (ah ha)

For example: human speech and recognition. From what I understand, SPREAD's recognition happens through HAC of about 100 different environment sounds. Would I be running HAC on the 42 different phonemes of English? Also, from what I know of consonants and vowels... you can't really propagate consonants, because many times they are just inferred during perception...

TODO: what is TLM... 

Monday, January 28, 2013

Todo list this week

Read a lot of things and produce a presentation on Friday: a 5-10 minute oral presentation
and a 20-minute presentation on either item 4 or 5
 
1- Pengfei's sound paper (done)
2- Music and Computers web text book (done)
3- Computational Methods in Acoustics (next)
4- Interactive Physically-based Sound Simulation (seen before, but not really read)
5- Zheng's Thesis

I came across Interactive Physically-based Sound Simulation during a project I did in 563 on sound generation, but I've never really read the thesis. I actually don't really know how to read a thesis. They are 100+ pages long... So far I've just used ctrl+f to find the keyword I'm looking for and read that particular section, but I'm not sure if this is the correct way to read one.

Thursday, January 24, 2013

Blah, just realized the full proposal is due tomorrow

Meetings are Fridays 2 - 3 from now on. Do a 15-minute presentation by next Friday... Just transferred all the content from the tumblr blog to this one. I need to pump out a full project proposal for tomorrow. (Ah ha, I have no idea what I'm writing about).

Example of speech represented via sine waves

Example of speech in sine waves: around halfway down the chapter 4.2 page of "Music and Computers". This is an example of speech done with sine waves. It is a bit concerning for me because I’m guessing the system I’ll be using will be built on sine waves, and speech sounds may sound like this.
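For reference, here is a minimal sketch of what "sound built from sine waves" means in code (additive synthesis). I don't know how the text book's example was actually produced; the function name `additive_synth` and the frequencies below are my own illustrative assumptions, not taken from the text book or from SPREAD.

```python
import math

def additive_synth(partials, sample_rate_hz, duration_s):
    """Sum sine waves. partials is a list of (frequency_hz, amplitude) pairs."""
    n_samples = round(sample_rate_hz * duration_s)
    out = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of this sample in seconds
        out.append(sum(a * math.sin(2 * math.pi * f * t) for f, a in partials))
    return out

# Hypothetical partials, roughly in the range where vowel formants sit.
# These numbers are illustrative only, not measured from any recording.
vowel_like = [(700.0, 1.0), (1200.0, 0.5), (2600.0, 0.25)]
signal = additive_synth(vowel_like, sample_rate_hz=8000, duration_s=0.01)
```

With only a handful of fixed sine waves the result is a steady buzz-like tone, which is probably part of why speech rebuilt this way sounds strange: real speech changes its frequency content continuously.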

Music and Computers

Music and Computers: one of the best resources on music, computers, and sampling rates (not enough math for my taste, but it is very understandable). I’m on chapter 4. C: I’ve been doing this instead of my 3D-scanner work like I should. I think I need some time getting used to the idea of bandwidth and sampling rate and such. (I don’t know how to describe it, there’s this extra dimension that I’m not used to when dealing with signals… a little bit like nested for-loops in the beginning of programming.) You have sampling rate (how often the amplitude is measured, once per time step); you have bits per sample. I’m missing the idea about how the two are related. I know Nyquist-Shannon says you need to sample at a rate more than 2 times the highest frequency to not have aliasing. So maybe sampling rate and how many buckets (which is related to bits per sample) are related. Apart from that, I’ll be doing 3D-scanning stuff and maybe reading the rest of this link this weekend at PennApps.
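A note on the sampling rate vs. bits per sample question: as far as I can tell, they are two independent knobs. Sampling rate is resolution along the time axis (and it is the only one Nyquist-Shannon talks about), while bits per sample is resolution along the amplitude axis. Below is a minimal sketch in plain Python; the function names are mine and the 8 kHz rate is just an example value.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Time axis: take one amplitude reading per time step."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def quantize(samples, bits):
    """Amplitude axis: round each value in [-1, 1] to one of 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)  # spacing between neighbouring "buckets"
    return [round((s + 1.0) / step) * step - 1.0 for s in samples]

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency a pure tone appears to have once it has been sampled."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

rate = 8000                              # example rate; Nyquist limit is 4000 Hz
ok_tone = sample_sine(3000, rate, 16)    # below the limit, captured correctly
bad_tone = sample_sine(5000, rate, 16)   # above the limit, shows up as 3000 Hz
coarse = quantize(ok_tone, 3)            # same timing, only 8 amplitude levels
```

Here `bad_tone` comes out as the sample-by-sample negative of `ok_tone`, so after sampling the 5000 Hz tone cannot be told apart from a 3000 Hz one. Changing `bits` in `quantize` does nothing about that; it only changes how much rounding noise each sample carries.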

Project Proposal

Last semester I was planning to do Senior Design on a 3D-scanning project. This semester, I decided that since I really like sound, I should do what I like. I met with RoboCup and Professor Badler (+Pengfei). I think I am going with the graphics project over the robotics project because I don’t know if I have enough background in hardware to deal with the numerous hardware failures in robotics. (pasting my 1 page project brief)

Jiali Sheng
Project Brief:

I’ve always been interested in sound, and I decided to change my project to something relating to sound this semester. I’ve talked with Professor Badler and Pengfei, became interested in “SPREAD”, and would like to work on how sound distorts over distance in the system. More specifically, how speech (English) distorts over distance. I would first test the system with the 40 phonemes in English and evaluate how coherent these phonemes remain as they distort in the simulation over space. The prediction is that because different frequencies of sound will distort differently in SPREAD, the level of distortion will be different for different phonemes. As a result, speech will sound strange (at best) or incoherent through the system. My project will be focused on making English speech more coherent in the system.

Without really knowing how SPREAD works, I am unable to give a concrete plan for the actual algorithm. However, I do know that similar problems have been looked at by phone companies, because higher frequencies distort more than lower frequencies. To solve this, they sample voice at 8 kHz or higher. Voice transcription services have also dealt with similar problems, and they employ a guessing system where, if one syllable isn’t clear, they calculate a list of syllables that are likely to appear there and take a guess.

To me, this problem is a lot more interesting than the problem of scanning frescoes better via a better scanner. And I hope that I will have approval to do this project as my Senior Design. My project blog will likely be located here instead: http://soundgen.tumblr.com/. This is the blog I used for my physics based animation final, but because the topic is more similar, I think it’ll be a better place.