We learn how to build nVidia GPU-based (super)computers, network and program them for HPC (High Performance Computing) using **CUDA C** language. We apply all this to do some astrophysical simulations related to forming planetary systems.

## Monday, May 31, 2010

### [from Pawel] how to compute tau? Party on Sat.

Suppose the disk is 2D, otherwise you'll have to generalize. Then each particle must deposit

its dtau (small contribution to tau).

we talked a lot about this with James amd he's your guru on that (right James?). ask him any questions. you first declare an array (2d: r,phi) that will hold dtau(r,phi)

we established that it is much better to use the so-called "scatter" approach than "gather" approach. in the first, you traverse the list of N particles, and donate their dtau's to the

appropriate grid locations (grid indexes are not stored but found each time you need them for a given particle, it should save time and space in ram). in the second approach, you'd traverse the

dtau grid and look for particles inside each grid cell (that's a lot of searching! ~N.)

we need exp(-tau) in the end, we can get it by integrating tau(r,phi) = integral_0^r dtau(r,phi)

= sum_along_r_from_the_innermost_to_the_particle's_cell of dtau(r,phi).

we've talked about a nice trick to avoid computing exponential functions later on.

suppose you have dtau(r) in an array and want exp(-tau). then you can get a very good approximation to f=exp(-tau) by

f = integral_0^tau (-f) dtau

why? because df(tau) = -f(tau) dtau is the differential equation with solution f=exp(-tau) .

you need to start with initial value f=1 at the inner radius and integrate outside.

what is dtau contributed by each particle? it's dA/(4pi r^2) where dA is its geom. cross section.

in general, we don't want to count particles, all we want is to arrive at tau(r,phi) which

in the initial axisymmetric state has tau(r=outer radius) = tau_infinity = 10 (or 3, or something you input). Therefore do an integration using dtau = C/r^2 where C=1 initially, and after getting

tau(outer) or exp(-tau(outer)) computed with the above algorithm, you recalculate C so in all the

later calls to the tau-computing procedure it gives the right outside exp(-tau_infinity) number.

* * *

We will meet on Friday downtown, meanwhile there seems to be enough work and enough computers already doing or starting to do cuda at UTSC to drop in there the rest of the week.

On Sat, as I mentioned, you are invited for a pool-side party (swimming encouraged,

take your swimming pants). bring a friend if you'd like.

I'll make some bbq, but if you want fancy salads maybe we can make it a potluck = byo.. party) hopefully the weather will cooperate so we can go flying for 20 min with each of you out of buttonville airport (north of 404 & 407). http://planets.utsc.utoronto.ca/~pawel/VANsRV.html

please let me know how many are coming and whether you have a car (it would be good to have

more than just my car for the buttonville trip).

finally, the coordinates (time will still be confirmed, we'll see if the Wx is ok the whole or just part of the day):

37 Fairhill Cres., toronto, 2pm, Sat 26 May. I can give you a 1-2km ride from from Underhill and Lawrence to my place, send me the details of how you're coming so we'll set some time to meet

at the gas station there. the bus that goes on Lawrence is called 54.

## Thursday, May 27, 2010

### [from Josh] OpenGL code

"look at how I did this under HTML view to see what a pre tag is"

The way that I have built my code everything that is needed to run the simulation (such as "math.h", "freeglut.h",...etc) is "#include"ed in one header file. Also within that header file I define the global variables that will be used for the simulation such as number of particles, current runtime, dt, solar mass,... etc. I also define a class object called particle and define a dynamic array of particle objects. I then include this first header file in another header file called "forces.h". In the header file "forces.h" I define all the different forces that I want to take effect in my simulation (currently just gravitational but its very easy to add in other forces). I then include the forces header file into another header file called "movePos.h". In "movPos.h" all the numerical integration takes place. It calls functions from "forces.h" to do so and updates the particles positions. I then include "movePos.h" into my graphic header called "3Ddisplay.h" which I have posted below. Here, within my render function I call functions in "movePos.h" which update the particle object array that I defined in the very first header file. I then graph the positions of each of the particles adding visualization techniques and also display data.

Up until now everything has been contained in seperate header files that are easily updated, debugged, and managed. You should know that header files release their properties to any file that includes them, (ie. things that include header files inherit all things within the header files). I then have a file (not a header) called "main.cpp". This file includes "3Ddisplay.h" and thus inherits all the other header files functions and global variables. In "main.cpp" I define a single function called "main" that interactivity asks the user what type of simulation they would like to perform, what the parameters of the simulation are and then calls the displaySim function in "3Ddisplay.h" which then displays the simulation as you requested. The program is terminated when the user presses "q" in the display window.

Also my simulation utilizes openMP and will run in parallel across all available CPU cores.

I tested my program using "extern" variables instead of in inherited header variables but the complexity rose considerably and made debugging a nightmare. I also test various pointer array object declaration techniques to see which one was the fastest and I found that the using my current method of memory allocation is the fastest. This is similar to the way we would do things with CUDA too since CUDA works fastest with "pinned memory" (memory that is allocated once and remains static for the remainder of the program).

If you would like to use the below code you will have to either a) modify it completely so that it works with your code format, or b) try and follow my format.

In general you really only need to do two things:

- change displaySim function to "main" and add #include freeglut.h at the top of your program

- in the render function replace the movePos3D with your own code that updates particle positions. (Note where it says pPart[i] is my particle object and you should replace it with what ever your particle is or the corresponding x,y,z,vx,vy,vz of your code.

Note: you will need to have freeglut installed on your system and locatable by your compiler. See my previous post on the topic for those using windows.

#include "movePos.h"

/*openGL vars*/

static int left_click = GLUT_UP;

static int right_click = GLUT_UP;

static int xold;

static int yold;

static int width;

static int height;

static int wh;

static int hw;

static int vidtype = 'n';

static int leavetrail = 0;

static int paused = 0;

static double rotate_x = 30;

static double rotate_y = 15;

static double alpha = 0;

static double beta = 0;

static double scale = 100;

static double trans = -0.1*scale;

/*openGL functions*/

void ReshapeFunc(int new_width, int new_height){

width = (double)new_width;

height = (double)new_height;

hw = height / (double) width;

wh = width / (double) height;

glMatrixMode(GL_PROJECTION);

glLoadIdentity();

gluPerspective(20 * sqrt(double(1 + hw * hw)), wh, 8, 12);

glMatrixMode(GL_MODELVIEW);

glutPostRedisplay();

}

void MotionFunc(int x, int y){

if (GLUT_DOWN == left_click){

rotate_y = rotate_y + (y - yold) / 5.f;

rotate_x = rotate_x + (x - xold) / 5.f;

if (rotate_y > 90) rotate_y = 90;

if (rotate_y < -90) rotate_y = -90; glutPostRedisplay(); } if (GLUT_DOWN == right_click){ beta = beta + (y - yold) / 2.f; alpha = alpha + (x - xold) / 2.f; glutPostRedisplay(); } xold = x; yold = y; } void keyCB(unsigned char key, int x, int y){ if( key == 'q' ){ free(pPart); exit(0); } if( key == '+' ){ scale *= 2.0/3.0; } if( key == '-' ){ scale *= 1.5; } if( key == 'r' ){ vidtype = 'r'; } if( key == 'v' ){ vidtype = 'v'; } if( key == 'n' ){ vidtype = 'n'; } if( key == 't' ){ leavetrail = abs(leavetrail - 1); } if( key == 'p' ){ paused = abs(paused - 1); } } void renderBitmapCharacher(float x, float y, float z, void *font,char *string){ char *c; glRasterPos3f(x,y,z); for (c=string; *c != '\0'; c++){ glutBitmapCharacter(font, *c); } } void Render(int face){ double D; int i; /* Axis */ glBegin(GL_LINES); glColor3d(1, 0, 0); glVertex3d(-0.7*scale, -0.7*scale, -0.7*scale); glVertex3d( 0.7*scale, -0.7*scale, -0.7*scale); glColor3d(0, 1, 0); glVertex3d(-0.7*scale, -0.7*scale, -0.7*scale); glVertex3d(-0.7*scale, 0.7*scale, -0.7*scale); glColor3d(0, 0, 1); glVertex3d(-0.7*scale, -0.7*scale, -0.7*scale); glVertex3d(-0.7*scale, -0.7*scale, 0.7*scale); glEnd(); glRotated(beta, 1, 0, 0); glRotated(alpha, 0, 1, 0); glColor3d(1, 1, 1); glBegin(GL_POINTS); if (face==0) movePos3D(); KE = 0.0; #pragma omp parallel for for (i=0;iwidth)

glOrtho(-scale, scale, -scale * hw, scale * hw, -scale, scale);

else

glOrtho(-scale * wh, scale * wh, -scale, scale, -scale, scale);

glMatrixMode(GL_MODELVIEW);

/* Perspective view */

glViewport(0, 0, width / 2, height / 2);

glPushMatrix();

/* Display data */

char Npart_str[33];

sprintf(Npart_str,"N=%d",Npart);

renderBitmapCharacher(-scale,scale - 0*scale*0.06,0,(void *)GLUT_BITMAP_8_BY_13,Npart_str);

char G_str[33];

sprintf(G_str,"G=%2.4Lf",G);

renderBitmapCharacher(-scale,scale - 1*scale*0.06,0,(void *)GLUT_BITMAP_8_BY_13,G_str);

char t_str[33];

sprintf(t_str,"t=%2.4Lf",t);

renderBitmapCharacher(-scale,scale - 2*scale*0.06,0,(void *)GLUT_BITMAP_8_BY_13,t_str);

char h_str[33];

sprintf(h_str,"h=%2.4Lf",h);

renderBitmapCharacher(-scale,scale - 3*scale*0.06,0,(void *)GLUT_BITMAP_8_BY_13,h_str);

char scale_str[33];

sprintf(scale_str,"axis scale=%2.4Lf",scale*2);

renderBitmapCharacher(-scale,scale - 4*scale*0.06,0,(void *)GLUT_BITMAP_8_BY_13,scale_str);

char K_str[33];

sprintf(K_str,"Kinetic Energy=%2.4Lf",KE);

renderBitmapCharacher(-scale,scale - 5*scale*0.06,0,(void *)GLUT_BITMAP_8_BY_13,K_str);

glTranslated(0,0,trans);

glRotated(rotate_y, 1, 0, 0);

glRotated(rotate_x, 0, 1, 0);

Render(0);

glPopMatrix();

/* Right view */

glViewport(0, height / 2 + 1, width / 2 + 1, height / 2);

glPushMatrix();

glRotated(90, 0, -1, 0);

Render(1);

glPopMatrix();

/* Face view */

glViewport(width / 2 + 1, height / 2 + 1, width / 2, height / 2);

glPushMatrix();

Render(2);

glPopMatrix();

/* Top view */

glViewport(width / 2 + 1, 0, width / 2, height / 2);

glPushMatrix();

glRotated(90, 1, 0, 0);

Render(3);

glPopMatrix();

/* Projection matrix is restored */

glMatrixMode(GL_PROJECTION);

glPopMatrix();

glMatrixMode(GL_MODELVIEW);

/* End */

glFlush();

glutSwapBuffers();

}

/* Run upon click */

void MouseFunc(int button, int state, int x, int y)

{

if (GLUT_LEFT_BUTTON == button)

left_click = state;

if (GLUT_RIGHT_BUTTON == button)

right_click = state;

xold = x;

yold = y;

}

/* Timed refresh to control refresh rate */

void TimerdisplayCB(int value){

if (paused == 0) glutPostRedisplay();

glutTimerFunc(33,TimerdisplayCB,value);

}

void displaySim(int *argcp, char **argv){

/* Creation of the window */

glutInit(argcp, argv);

glutInitDisplayMode(GLUT_RGBA);

glutInitWindowSize(glutGet(GLUT_SCREEN_WIDTH),glutGet(GLUT_SCREEN_HEIGHT));

glutCreateWindow("Particle Simulation");

/* OpenGL settings */

glClearColor(0, 0, 0, 0);

glEnable(GL_CULL_FACE);

glCullFace(GL_BACK);

glEnable(GL_BLEND);

glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

/* Declaration of the callbacks */

glutDisplayFunc(DisplayFunc);

glutReshapeFunc(ReshapeFunc);

glutKeyboardFunc(keyCB);

glutMouseFunc(MouseFunc);

glutMotionFunc(MotionFunc);

glutTimerFunc(33,TimerdisplayCB,0);

/* Main display */

glutMainLoop();

}

## Wednesday, May 26, 2010

### [from Robert] cudak4 & cudak5 up and running

### [from Anthony] some CUDA examples

https://visualization.hpc.mil/wiki/GPGPU

n=10 iterations=1000 time_gpu=6586.000 time_cpu=1601.000

n=10 iterations=100 time_gpu=615.000 time_cpu=150.000

n=60 iterations=100 time_gpu=617.000 time_cpu=5980.000

I won't be posting their codes - it's pretty length-y. They also have a plot where they compare the CPU/GPU times of executing the program.

By the way, the devices I have in my macbook pro are...

Device 0: "GeForce 9600M GT"

CUDA Driver Version: 3.0

CUDA Runtime Version: 3.0

CUDA Capability Major revision number: 1

CUDA Capability Minor revision number: 1

Total amount of global memory: 268107776 bytes

Number of multiprocessors: 4

Number of cores: 32

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 8192

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 2147483647 bytes

Texture alignment: 256 bytes

Clock rate: 1.25 GHz

Concurrent copy and execution: Yes

Run time limit on kernels: Yes

Integrated: No

Support host page-locked memory mapping: No

Compute mode: Default (multiple host threads can use this device simultaneously)

Device 1: "GeForce 9400M"

CUDA Driver Version: 3.0

CUDA Runtime Version: 3.0

CUDA Capability Major revision number: 1

CUDA Capability Minor revision number: 1

Total amount of global memory: 266010624 bytes

Number of multiprocessors: 2

Number of cores: 16

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 8192

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 2147483647 bytes

Texture alignment: 256 bytes

Clock rate: 1.10 GHz

Concurrent copy and execution: No

Run time limit on kernels: Yes

Integrated: Yes

Support host page-locked memory mapping: Yes

Compute mode: Default (multiple host threads can use this device simultaneously)

**printf**to track what is happening in the program. I will try that and see what kind of results I get.

## Monday, May 24, 2010

### [from Josh] x,y,z n-body simulation

This is my 2d or 3d n body simulation with opemMP running parallel for loops (can run several 1000 particles fine on my CPU). I made the simulation in C/C++ and the visuals with openGL. From my interface you can orbit the system using your left mouse button, rotate the axes with the right mouse button, and zoom in and out with the +/- keys. You can also change colour of each particle. Pressing 'r' is supposed to give a color based on particle position (malfunctioning ATM), 'v' for velocity, 'n' for static distribution (to help track particles). In the video it starts off with 'n' and then goes to 'v'.

Using lower G values causes the particles to fly away to quickly and by zooming out you can see just out far the orbit can be with 100.

The particles appear to behave properly and a continued simulation results in sub systems forming and then orbiting each other. It is quite neat to see actually and if I remember I will run one for a few hours with 1000+ particles and take a picture to show you what happens.

I identified a point of possible error for everyone else's simulations. It depends on how you set your code up but in mine I had to be careful when calculating the forces not to use the updated position coords until each particle had been updated.

I have started porting it over to CUDA C. It shouldn't be too hard.

Josh,

## Sunday, May 23, 2010

### [from Pawel] our new machines.

Those of you who want to help, let's meet on Tuesday in UTSC (>10:30)

Maybe we'll try out the newest beta cuda driver 256.25

Everybody else, how are the exercises going? mathGL installed and working?

May is the time for cpu-based computing, June is for gpu-based computing.

I'm looking forward to seeing your cpu-based, working, programs and their descriptions here, in May.

You need to be ready to do cuda on cudak3 initially, later on cudak5 and hopefully cudak4 and cudak1 (I'm repairing it; it will not be on 24/7, but it's a great development platform, with cuda setup similar to cudak3).

To start with, you should reproduce some Dr. Dobbs exercises. Please report successes asap, I imagine by the end of this week.

Reminder - on June 23 we must to be ready with our cuda programs, and some brief desrciption of them will be placed in this blog.

I'd love to see a demo of real-time compressible 2d hydrodynamics (it's difficult but we'll try with Jeffrey), and more than one demos of tau>1 disk calculations, 2d or 3d, done with cuda, using 100 million particles (my guess).

so this is the plan. remember, cuda takes time to learn. help each other, team up if you want, ask me questions.

## Friday, May 21, 2010

### [from James] polar N-body random generation

R=[]

T=[]

i=0

while i < 1000:

j=0

while j < 4:

x=rand.random()

y=rand.random()

r=sqrt(x*x + y*y)

if r <= 1.0:

R.append(r)

T.append(j*pi*0.5 + atan(y/x))

j = j+1

i = i+1

## Thursday, May 20, 2010

### [from James] polar N-body self-gravity

We're calculating the total forces acting on a particle located at (R,θ).

Another particle located at (r,φ) will be a distance d from the first particle.

d

^{2}= R

^{2}+ r

^{2}+ 2*r*R*cos(ψ)

where ψ is angle between the position vectors for the two particles.

The gravitational force between these two particles acts along

**d**; we can split this into the parts acting along

**θ**and

**R**respectively:

F(d)

**d**= F(d)sinβ

**θ**+ F(d)cosβ

**R**

where the bold values are unit vectors and their non-bold counterparts are the corresponding magnitudes (since I don't know how to write those little vector arrow superscripts in html), and β is the angle between

**R**and

**d**,

β = arccos[(d

^{2}+ r

^{2}+ R

^{2})/(2dR)]

Pretty straightforward so far, right?

This force along

**d**produces acceleration along both

**R**and

**θ**:

F(d)

**d**= d

^{2}(θ

**θ**)/dt

^{2}+ d

^{2}(R

**R**)/dt

^{2}

Ok, now working out the derivatives for those accelerations is kinda long-ish, fortunately B&T did it already for the radial acceleration in chapter 3.1. Deriving the angular acceleration by the same method and adding the two gives the total acceleration as

d

^{2}(θ

**θ**)/dt

^{2}+ d

^{2}(R

**R**)/dt

^{2}

= [(d

^{2}θ/dt

^{2})*(R+1) + (dθ/dt)*(2(dR/dt) - θ*(dθ/dt)]

**θ**

+ [(d

^{2}R/dt

^{2}) + θ*(d

^{2}θ/dt

^{2}) - (dθ/dt)

^{2}(R+2)]

**R**

Now combining the above with our force equation near the top, separating the

**R**and

**θ**parts into two equations, and re-arranging, we finally get

d

^{2}θ/dt

^{2}= [F(d)sinβ - 2(dR/dt)(dθ/dt) - θ(dθ/dt)

^{2}] / (R+1)

d

^{2}R/dt

^{2}= F(d)cosβ + θ(d

^{2}θ/dt

^{2}) + (dθ/dt)

^{2}(R+2)

I've double-checked my calculations and obtained the same result, but the presence of d

^{2}θ/dt

^{2}in the d

^{2}R/dt

^{2}equation doesn't really make sense to me. Anyone have any ideas about that?

Of course the above equations only tell us the particle's acceleration due to the gravitational attraction of one other particle in the disk; we still need to add the (similar) forces from all the other particles, plus of course the central mass. Once we do, we'll have a self-gravitating disk.

### [from James] polar 1-body

# 2-body problem in polar co-ordinates. Assume that G*M = 1 and m = 1.

# "t" is the time, "kin" is a list/array of current orbital values passed

# along from the integrator function.

def g(t,kin):

GM = 1.0

r = kin[0] # radial position

theta = kin[1] # angular position

v = kin[2] # radial velocity

omega = kin[3] # angular velocity

E = kin[4] # total energy

L = kin[5] # angular momentum

dr = v

dtheta = omega

dVr= r*omega*omega - GM/(r*r)

domega = -2.0*omega*v/r # careful not to confuse omega and domega when reading below

dE = GM*v/(r*r) + v*dVr + r*v*omega*omega + r*r*omega*domega

dL = 2*r*omega*dr + r*r*domega

# send computed d/dt values back to the integrator

return np.array((dr,dtheta,dVr,domega,dE,dL))

# Initial conditions. Position in (r, theta) and eccentricity will be

# randomly generated. V (radial velocity) will always initially be zero, ie.

# the particle will always begin at apocenter or pericenter. Other values

# are then calculated from r, theta, and eccentricity.

GM = 1.0

X,Y,Eccent = rand.random(),rand.random(),rand.random()

R = sqrt(X*X + Y*Y)

Theta = atan(Y/X) # radians

V = 0.0

Axis = (R*(1.0 + Eccent*cos(Theta)))/(1.0-Eccent*Eccent) # semi-major axis

Omega = (sqrt(GM*Axis*(1.0-Eccent*Eccent)))/(R*R) # rad/time

Energy = -0.5*GM/Axis

AngMomentum = R*R*Omega

I'll send the whole file out via e-mail. If you have python you can run it as an executable, in which case it should display a polar plot of the particle's orbit, and also print out a table of data displaying time, r, theta, E and L for each timestep taken.

I know most of you are working in C/C++, but the actual formulas should work just as well regardless of language.

As for the N-body: the random generator of constant number density particle disk is working fine, and I think I have the polar-coordinate physics for the particle-particle gravitational interactions worked out, but I'm running into a wall on the programming side of things (which makes it hard to test the physics!). I hope to have it working before the weekend.

### [from Pawel] about mathGL

I hope you're working on your mathGL installation and visualization of your

many-particle disks. Yesterday, not everybody showed up so I'll repeat:

/home/pawel/mathgl-1.10 is a directory readable to you; it contains

mathgl-1.10.eng.pdf = the manual for methGL,

mathgl-1.10.tgz is in my /home/pawel/Download directory -

copy it, untar it like so

> tar xvf mathgl-1.10.tgz

and change into the directory that this command creates

> cd mathgl-1.10

it's a small package that you better set up in your directories, reading the installation part of the manual.

In my directory /home/pawel/math-1.10/ I have a sample program in C++ called

sample.cpp, also one in Fortran90 called sample.f90 and so on. You can copy if if you like,

it has only one type of plot uncommented but I tried many of them! Almost all types of graphs and multi-D

plots work perfectly.

One warning: before you install mathGL, you need to have a library gsl-1.13 installed in your system.

I have the tarball in /home/pawel/Download/gsl-1.13.tar.gz and I installed it for everybody to use on cudak3. If you're working on a different system, install it there, and if it's Windows, find another zip file for it. It should be described in the mathGL installation instructions, I'd think.

Those demo plots are fun to make! We'll see how you can use them for disks!

### [from Pawel] Our tasks, part 1

1. precede your post title with [from Author] bit, so we know who's talking.

2. if a post is a continuation of something previously discussed by you, give it a number like

[from Anthony] Particle motion, part 3. that will really help to find our way around here.

3. If you update your earlier post, which you shold do if you wish so, note that like (updated 01-01-2011)

at the end of the post title.

* * *

now, to the first point I want to talk about: simulations. I would like all of you to have minimal experience in particle dynamics before we move on, and I have some evidence that Jeffrey (experienced in this) and Anthony and perhaps Josh (though I'd like to see his resuklts posted here and shown to us in real-time openGL one day) are already close to being able to move to the next level. I know Robert was busy with hardware/system software as was Josh, so I can understand. Still, this is not so complicated and I expect that you'll post. James, are you going to post your 1-Body simulation here soon?

If any of you want to team up with others, please do so, dividing tasks between yourself so you know exactly everythign about what the other(s) have done, and report together.

on level 2, we will improve and build out your simulations to include physical interactions such as radiation pressure with shadow casting. Any ideas on how to do this? Obviously, we need to synchronize the calculations for all the particles so the forces they exert are computed at the same times. It is possible to

multi-time the calculations and have let's say most particles for which we update forces at intevals dt,

and the rest which don't need it that often, and hence are updated in force only every (2*dt).

It's not necessary and it would complicate our life in this instance.

You have to device a scheme for optical thickness comptation at each particle's location.

I think you need to youse a fixed Eulerian polar grid of points, enough of then so you have at least as much cells as particles. Each particle will contribute a certain contribution dtau to tau, and that will have to vary as 1/r^2. Let's see why. dtau by definition is dH/H, or relative increase/decrease of flux of radiation coming from the star. Radiation transfer equation is describing the change of flux in a parallel beam of radiation reads

dH = -H d(tau)

where -dH is the absorbed/scattered flux.

However, our beam of radiation is spreading since we have a spherical symmetry, so it's better to

multiply both sides of this equation by 4pi r^2, and call L the total flux or total luminosity of the radiation *going out radially* (not scattered or absorbed yet). Then

dL = -L d(tau)

and now we have the comfortable result that if 0% of radiation is scattered/absorbed in a layer (optical thickness dtau=0), then the total luminosity at all radii is constant and equal L=L_* (stellar luminosity).

from this, dtau due to one particle is simply the fraction of particle cross section pi*s^2 to the

sphere of radius r (r=distance from the star, s = radius of the dust particle),

dtau = pi s^2/ (4 pi r^2) = (1/4) (s/r)^2

a good way to handle the calculation is to have a procedure for evaluation of a map of dtau asa function of (r, theta) coordinates. I would take 360 cells in theta and 1000 in radius between r=0.1 (or so) to r=3 or so.

I assume that inital ring of particles will be chosen between r=1 and r=2. If there are a billion instead of million particles, I'd take 3600 cells in angle and 10000 in radius, for a total of 0.36e8 cells. So ok, there will be on average >1 particle per cell, that's fine, as long it's not >>1.

After that, you integrate along radial paths

tau(r,theta) = integral _0 ^r dtau(r,theta),

which can be though of as adding consecutive dtau values to the running sum called tau, going from

inner to outer radii, separately for every theta. integral^r means integral with upper limit r.

On the first usage of this routine, you set A=1 and dtau = A/r^2, neglecting all the precise constants that would be needed. That is done, because no matter what s^2 we take and how accurate with constants we are,

we simply don't know and don;t want to deal with the number of particles of dust.

We prefer to specify global properties of the disk: as we travere the disk in its midplane, the total tau

or tau(infinity), should accumulate to a given number, which is our input parameter for the simulation.

So in the first pass we divide the **average** (over theta) tau(r=r_outer) by the wanted final tau_infinity,

and store that number as the average optical thickness parameter per particle, A. Then I'd recalculate

the tau map with the updated A.

I'd recommend to try the following choices of radial optical depth if a disk:

tau_infinity = 0, 2, 4, 12

* * *

when the table of dtau is ready and the integrated tau(r,theta) is ready, you can then compte the table of

beta = F_rad/F_grav. that parameter, as you know, does depend on the position only via tau:

beta = beta0 * exp(-tau)

since, however, exp(-tau) happens to be the solution of differential equation

d(tau) = - tau * dn

where dn=1 is by how much the index of the cell jumps from one cell to another, you might as well

save the computer some time on evaluation of math function exp() like so:

you set the initial value of function f, which is your numerical approximation to exp(-tau), to 1 at the inner radius r0. Moving out from one cell to another, you multiply the previous value of f by (1-dtau) in the current cell, store the result on move on to the outer radius.

Why? well, exp(-tau) = exp (-integral^r dtau) = product over all cells of factors exp(-dtau), because

an exp of a sum is a product of the exp's.

and that is very nearly equal to the product of 1-dtau factors, because exp(-dtau) = 1 - dtau/1! + dtau^2/2! - dtau^3/3! +...., and our dtau <<1 (optical thickness of one cell <<1).

### [from Josh] Next Step

### [from Josh] New Office

## Wednesday, May 19, 2010

### [from Anthony] Remarks, continued

## Tuesday, May 18, 2010

### Anthony Reporting...

#include <iostream>

#include <cmath>

#include <cstdlib>

#include <fstream>

using namespace std;

int main ()

{

ofstream dataFile("2body.data");

double r,ax,ay,E,a;

double x = 1; //initial conditions

double y = 0;

double vx = 0;

double vy = 1;

double dt=0.5;

double t = 0;

for (int step =0; t<60 ;step ++) //integrator

{

r = sqrt(x*x + y*y);

ax = -x/(r*r*r);

ay = -y/(r*r*r);

vx += ax * dt ;

vy += ay * dt ;

x += vx * dt ;

y += vy * dt ;

t += dt;

dataFile x << '\t' << y << '\n';//writes data

}

return 0;

}

#include <iostream>

#include <cmath>

#include <cstdlib>

#include <fstream>

using namespace std;

int main ()

{

ofstream dataFile("nbody.data");

int m,n= 4;

double x[n][m],y[n][m],vx[n][m],vy[n][m],ax[n][m],ay[n][m],r[n][m],dt;

dt = 0.05;

srand(time(0));

for (int i=0; i

{

x[0][i]=0.5+(rand()/(RAND_MAX+1.0))*2.5;

y[0][i]=0;

vx[0][i]=0;

vy[0][i]=sqrt(1/x[0][i]);

double t=0;

for (int j=1; t<40; j++)

{ r[j-1][i]=sqrt(x[j-1][i]*x[j-1][i]+y[j-1][i]*y[j-1][i]);

ax[j-1][i]=-x[j-1][i]/(r[j-1][i]*r[j-1][i]*r[j-1][i]);

ay[j-1][i]=-y[j-1][i]/(r[j-1][i]*r[j-1][i]*r[j-1][i]);

vx[j][i] = vx[j-1][i]+(ax[j-1][i])*(dt);

vy[j][i] = vy[j-1][i]+(ay[j-1][i])*(dt);

x[j][i] = x[j-1][i]+vx[j-1][i]*dt;

y[j][i] = y[j-1][i]+vy[j-1][i]*dt;

dataFile x[j][i] << '\t' << y[j][i] << '\n';//Need to find another way to write the data properly

t += dt;

}

}

dataFile.close();

return 0;

}

## Monday, May 17, 2010

### How to install openGL to get graphics

- Put freeglut.h in: ‘C:\Program Files\Microsoft Visual Studio 9.0\VC\include\GL\’ (note: you'll have to create the GL folder)
- Put freeglut_ext.h in: ‘C:\Program Files\Microsoft Visual Studio 9.0\VC\include\GL\’
- Put freeglut_std.h in: ‘C:\Program Files\Microsoft Visual Studio 9.0\VC\include\GL\’
- Put freeglut.lib in: ‘C:\Program Files\Microsoft Visual Studio 9.0\VC\lib\’
- Put freeglut.dll in: ‘C:\WINDOWS\system32\’ (SysWOW64 folder for vista or win7 64 bit users)
- Put glut32.dll in: ‘C:\WINDOWS\system32\’ (SysWOW64 folder for vista or win7 64 bit users)
- Put glut32.lib in: ‘C:\Program Files\Microsoft Visual Studio 9.0\VC\lib\’
- Put glut.h in: ‘C:\Program Files\Microsoft Visual Studio 9.0\VC\include\GL\’