Yes, use geographic coordinates (longitude and latitude) for Google/Bing Maps.
When the world file is present, ArcGIS performs the image-to-world transformation. This is a six-parameter affine transformation; its equation is given below, after the parameter listing.
The transformation parameters are stored in the world file in this order:
20.17541308822119 (A)
0.00000000000000 (D)
0.00000000000000 (B)
-20.17541308822119 (E)
424178.11472601280548 (C)
4313415.90726399607956 (F)
The equation:
x1 = Ax + By + C
y1 = Dx + Ey + F
where
x1 = calculated x-coordinate of the pixel on the map
y1 = calculated y-coordinate of the pixel on the map
x = column number of a pixel in the image
y = row number of a pixel in the image
A = x-scale; dimension of a pixel in map units in x direction
B, D = rotation terms
C, F = translation parameters; x,y map coordinates of the center of the upper left pixel
E = negative of y-scale; dimension of a pixel in map units in y direction
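As a minimal sketch of the equation above, the following reads the six values in the on-disk order A, D, B, E, C, F and maps a pixel (column, row) to map coordinates. The names `WorldFile`, `parse_world_file` and `pixel_to_map` are made up for illustration and are not part of ArcGIS or any library.

```python
from typing import NamedTuple, Tuple


class WorldFile(NamedTuple):
    # Field order matches the order of the lines in the world file.
    A: float  # x-scale
    D: float  # rotation term
    B: float  # rotation term
    E: float  # negative of y-scale
    C: float  # x of the center of the upper left pixel
    F: float  # y of the center of the upper left pixel


def parse_world_file(text: str) -> WorldFile:
    """Read six numbers, one per non-empty line, in the order A, D, B, E, C, F."""
    values = [float(line.split()[0]) for line in text.splitlines() if line.strip()]
    if len(values) != 6:
        raise ValueError("expected six parameters, got %d" % len(values))
    return WorldFile(*values)


def pixel_to_map(wf: WorldFile, col: float, row: float) -> Tuple[float, float]:
    """Apply x1 = Ax + By + C and y1 = Dx + Ey + F."""
    x1 = wf.A * col + wf.B * row + wf.C
    y1 = wf.D * col + wf.E * row + wf.F
    return x1, y1


SAMPLE = """\
20.17541308822119
0.00000000000000
0.00000000000000
-20.17541308822119
424178.11472601280548
4313415.90726399607956
"""

wf = parse_world_file(SAMPLE)
print(pixel_to_map(wf, 0, 0))  # center of the upper left pixel, i.e. (C, F)
print(pixel_to_map(wf, 1, 1))  # one pixel right and one pixel down
```

With these sample values, moving one column to the right adds about 20.18 map units to x, and moving one row down subtracts the same amount from y, because E is negative.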
For images to be overlaid on Bing Maps, MapCruncher should be used.
For Google Maps, use
http://mapki.com/wiki/Add_Your_Own_Custom_Map
Both use the Mercator projection; see the parameters for Google Maps.
Useful links:
http://econym.org.uk/gmap/custommap.htm
There are many conventions for image world files. What they share is the image-to-world transformation matrix. (Where they differ is in how the matrix elements are represented and in how the pixels are referenced: more about that at the end.)
In almost all cases the world file represents an affine matrix (not the full projective matrix). The affine matrix has six coefficients, often written in 3 x 3 matrix form as
A B C
D E F
0 0 1
Fortunately, you don't need to do much arithmetic to decipher this, nor do you need to learn any more math than you already know, because there is a nice simple interpretation of what this matrix means. It says that:
The point at (0,0) in the intrinsic image coordinates (usually column and row) corresponds to the point (C,F) in the world. These coordinates come from the third (rightmost) column of the matrix.
The point at (1,0) corresponds to (A,D) + (C,F) = (A+C, D+F). This is the sum of columns 3 and 1.
The point at (0,1) corresponds to (B,E) + (C,F) = (B+C, E+F). This is the sum of columns 3 and 2.
To create a world file, then, you have to work out the values of A ... F. But the solution is easy:
(C,F) are the coordinates of where you would like (0,0) (the image's origin) to be.
(A,D) are obtained by subtracting (C,F) from the coordinates of where you would like (1,0) to be. (This is often the second pixel along the first row.)
(B,E) are obtained by subtracting (C,F) from the coordinates of where you would like (0,1) to be. (This is often the second pixel along the first column.)
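The three subtractions above can be sketched in a few lines. The function names and the sample coordinates are hypothetical, and the output order A, D, B, E, C, F is the world file order quoted earlier on this page; check it against your own software before relying on it.

```python
def affine_from_points(origin, unit_x, unit_y):
    """Return (A, B, C, D, E, F) from the world positions of (0,0), (1,0), (0,1)."""
    C, F = origin
    A, D = unit_x[0] - C, unit_x[1] - F  # where (1,0) lands, minus (C,F)
    B, E = unit_y[0] - C, unit_y[1] - F  # where (0,1) lands, minus (C,F)
    return A, B, C, D, E, F


def world_file_lines(coeffs):
    """Format the coefficients in the on-disk order A, D, B, E, C, F."""
    A, B, C, D, E, F = coeffs
    return [repr(v) for v in (A, D, B, E, C, F)]


# Hypothetical north-up image with 10-unit pixels and rows running downward.
coeffs = affine_from_points(
    origin=(500000.0, 4000000.0),
    unit_x=(500010.0, 4000000.0),
    unit_y=(500000.0, 3999990.0),
)
print("\n".join(world_file_lines(coeffs)))
```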
For example, if you would like to rotate the image 90 degrees counterclockwise around its origin, place the origin at the point (100, -200), and scale each pixel up by 30, you can easily work out (by drawing a picture, for instance) that
(0,0) should wind up at (100,-200), so (C,F) = (100,-200).
(1,0) should wind up at (100, 30-200), so (A,D) = (0,30). (Reason: rotating (1,0) sends it to (0,1); scaling up by 30 gives (0,30); translating by (100,-200) gives (100,30-200).)
(0,1) should wind up at (-30+100, -200) so (B,E) = (-30,0). (Reason: rotating (0,1) sends it to (-1,0), etc.)
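If you would rather not draw the picture, the same numbers fall out of composing the rotation, scale and translation. This is only a sketch: `rotate_scale_translate` is a made-up helper, and it assumes the counterclockwise rotation is measured in a y-up coordinate system, as in the example above.

```python
import math


def rotate_scale_translate(angle_deg, scale, tx, ty):
    """Return (A, B, C, D, E, F) for: rotate counterclockwise, scale, then translate."""
    t = math.radians(angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    # (A, D) is the image of (1,0) and (B, E) is the image of (0,1),
    # both before the translation (C, F) is added.
    A, D = scale * cos_t, scale * sin_t
    B, E = -scale * sin_t, scale * cos_t
    return A, B, float(tx), D, E, float(ty)


coeffs = rotate_scale_translate(90, 30, 100, -200)
# Rounding hides the tiny floating-point residue left by cos(90 degrees).
print([round(v, 9) for v in coeffs])
```

Up to floating-point rounding, this reproduces (A,D) = (0,30), (B,E) = (-30,0) and (C,F) = (100,-200).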
You can still get it wrong, depending on how your software intrinsically references the pixels in the image. The principal variations are (i) rows can go from top to bottom instead of bottom to top, placing the image origin in the upper left corner and (ii) the image could be referenced by (row, column) instead of (column, row). The first one causes the image to be reflected across a horizontal line while the second one reflects it across a diagonal line. If you get either of those two errors, you will immediately see what's going on and will know what to fix. Documentation can be so obtuse (or non-existent) that in practice I just try it, deduce the convention from how the image turns out, and proceed accordingly.
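Once you have deduced which convention is in play, the coefficients can be re-expressed rather than recomputed from scratch. The sketch below is one way to do that; the helper names are hypothetical, and it assumes integer pixel indices running from 0 to height - 1 with coordinates referring to pixel centers.

```python
def apply_affine(coeffs, x, y):
    """Map intrinsic image coordinates (x, y) to world coordinates."""
    A, B, C, D, E, F = coeffs
    return A * x + B * y + C, D * x + E * y + F


def flip_rows(coeffs, height):
    """Re-express coefficients for the opposite row direction.

    The new row index is y' = (height - 1) - y, so y = (height - 1) - y'.
    """
    A, B, C, D, E, F = coeffs
    return A, -B, C + B * (height - 1), D, -E, F + E * (height - 1)


def swap_axes(coeffs):
    """Re-express coefficients for (row, column) input instead of (column, row)."""
    A, B, C, D, E, F = coeffs
    return B, A, C, E, D, F


# Hypothetical 5-row image, 2-unit pixels, rows counted from the top.
top_down = (2.0, 0.0, 10.0, 0.0, -2.0, 50.0)
bottom_up = flip_rows(top_down, height=5)

# Row 3 counted from the top is row 1 counted from the bottom.
print(apply_affine(top_down, 1, 3), apply_affine(bottom_up, 1, 1))
```

Both calls should print the same world point, which is a quick way to check that the conversion matches what your software expects.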
Note that rotations by non-multiples of 90 degrees and differential rescaling require resampling and therefore are not universally supported. This frequently limits the allowable matrices to those with either B = D = 0 and |A| = |E|, or A = E = 0 and |B| = |D|.
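A quick way to test whether a given set of coefficients falls inside that restricted family is sketched below. `needs_resampling` and its tolerance are assumptions chosen for illustration, not a rule taken from any particular package.

```python
import math


def needs_resampling(A, B, D, E, tol=1e-9):
    """Return True unless the matrix is a uniform scale with a multiple-of-90 rotation."""

    def is_zero(v):
        return math.isclose(v, 0.0, abs_tol=tol)

    def same_size(u, v):
        return math.isclose(abs(u), abs(v), rel_tol=tol, abs_tol=tol)

    axis_aligned = is_zero(B) and is_zero(D) and same_size(A, E)
    quarter_turn = is_zero(A) and is_zero(E) and same_size(B, D)
    return not (axis_aligned or quarter_turn)


print(needs_resampling(20.17541308822119, 0.0, 0.0, -20.17541308822119))  # north-up, square pixels
print(needs_resampling(0.0, -30.0, 30.0, 0.0))      # 90 degree rotation, uniform scale
print(needs_resampling(21.2, -21.2, 21.2, 21.2))    # 45 degree rotation
print(needs_resampling(10.0, 0.0, 0.0, -20.0))      # differential rescaling
```

The first two cases fall inside the restricted family; the last two do not.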
Best Answer
Assuming zero rotation terms for x and y (B = D = 0), you need the PNG image's dimensions in pixels in order to derive the x and y pixel sizes.
Rudimentary example:
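A minimal sketch of that derivation, under stated assumptions: the map extent of the image is known and refers to the outer edges of the corner pixels, rotation is zero, and the output follows the A, D, B, E, C, F order with (C, F) at the center of the upper left pixel. The function name and the numbers are hypothetical.

```python
def world_file_from_extent(xmin, ymin, xmax, ymax, width_px, height_px):
    """Return world file values [A, D, B, E, C, F] for an unrotated image."""
    A = (xmax - xmin) / width_px       # pixel size in x
    E = -(ymax - ymin) / height_px     # negative pixel size in y (rows run downward)
    C = xmin + A / 2.0                 # x of the center of the upper left pixel
    F = ymax + E / 2.0                 # y of the center of the upper left pixel
    return [A, 0.0, 0.0, E, C, F]


# Hypothetical 100 x 50 pixel PNG covering x from 0 to 1000 and y from 0 to 500.
for value in world_file_from_extent(0.0, 0.0, 1000.0, 500.0, 100, 50):
    print(value)
```

The image width and height would normally come from whatever image library or viewer you already use.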