The Dude!

Wednesday, June 17, 2009

GE Advanced Algorithms Scientist

Job Number	1051555
Business	GE Technology Infrastructure
Business Segment	Technology Infrastructure - Healthcare
Posted Position Title	Advanced Algorithms Scientist
Career Level	Experienced
Function	Engineering/Technology
Function Segment	Research - Physics
Location	United States
Relocation Expenses	Partial Expenses
Role Summary/Purpose	The Advanced Algorithms Scientist develops innovative, advanced and robust algorithms for tomographic image reconstruction
Essential Responsibilities
· Develops innovative, advanced and robust algorithms for tomographic image reconstruction.
· Works with cross-functional scientific, engineering, and integration teams to deliver high performance, reliable algorithms for implementation on dedicated signal processing architectures.
· Works with scientists and application specialists to improve existing algorithms and to introduce new innovative approaches for image reconstruction.
· Designs, develops, and maintains data processing techniques and structures that will be supported over a wide range of signal acquisition and processing hardware platforms.
· Performs troubleshooting, problem solving analysis, and performance enhancements for imaging products.
· Prepares technical documentation and design reviews.
· Assists in the planning and design of the next generation of high performance reconstruction engines for future products.
Qualifications/Requirements
· Master’s degree in one of the following: Electrical Engineering with a focus on image reconstruction and/or signal processing, Computer Science, Computer Engineering, Biomedical Engineering with a math or physics emphasis, or Applied Mathematics
· 5+ years industry experience in algorithm development and/or signal processing, or 5+ years active research in CT reconstruction
· Good communication skills in a team environment
· Experience with numerical analysis, optimization and stochastic processes
· Experience with analysis and simulation software (e.g. MATLAB/IDL)
· Experience in technical or scientific writing
· Proven ability to drive resolution of challenging problems and deliver timely and effective design solutions
**Work location is flexible within the US
Additional Eligibility Qualifications	For U.S. employment opportunities, GE hires U.S. citizens, permanent residents, asylees, refugees, and temporary residents. Temporary residence does not include those with non-immigrant work authorization (F, J, H or L visas), such as students in practical training status. Exceptions to these requirements will be determined based on shortage of qualified candidates with a particular skill. GE will require proof of work authorization. Any offer of employment is conditioned upon the successful completion of a background investigation and drug screen.
Desired Characteristics
· PhD degree in Electrical Engineering with a focus on image reconstruction and/or signal processing, Computer Science, Computer Engineering, Biomedical Engineering with a math or physics emphasis, or Applied Mathematics
· Experience with deterministic and statistical tomographic image reconstruction algorithms
· Demonstrated record of innovation in development
· Excellent oral and written communication skills
· Experience with research collaboration
· Self-starter, energizing, results oriented, and able to multi-task
· Demonstrated problem solving ability and results orientation
· Demonstrated ability to work in a collaborative, matrixed, and customer focused environment
· Excellent communication, influencing skills and ability to gain buy-in for initiatives

Sunday, May 10, 2009

Binary mode v.s. Text mode

1.1 Newline
A newline, also known as a line break or end-of-line (EOL) character, is a special character or sequence of characters signifying the end of a line of text.
Software applications and operating systems usually represent a newline with one or two control characters:

Systems based on ASCII or a compatible character set use either LF (Line Feed, 0x0A) or CR (Carriage Return, 0x0D) individually, or CR followed by LF (CR+LF, 0x0D 0x0A). These characters are based on printer commands: the line feed indicated that one line of paper should feed out of the printer, and the carriage return indicated that the printer carriage should return to the beginning of the current line.

LF: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others
CR+LF: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M, MP/M, DOS, OS/2, Microsoft Windows, Symbian OS

The C programming language provides the escape sequences '\n' (newline) and '\r' (carriage return). However, these are not required to be equivalent to the ASCII LF and CR control characters. The C standard only guarantees two things:

Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.
When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, the second mode of I/O supported by the C library, no translation is performed, and the internal representation of any escape sequence is output directly.

On Unix platforms, where C originated, the native newline sequence is ASCII LF (0x0A), so '\n' was simply defined to be that value. With the internal and external representation being identical, the translation performed in text mode effectively turns into a no-op, making text mode and binary mode behave the same. This has caused many programmers who developed their software on Unix systems simply to ignore the distinction completely, resulting in code that is not portable to different platforms.

1.2 Common problems

The different newline conventions often cause text files that have been transferred between systems of different types to be displayed incorrectly. For example, files originating on Unix or Apple Macintosh systems may appear as a single long line on a Windows system. Conversely, when viewing a file from a Windows computer on a Unix system, the extra CR may be displayed as ^M at the end of each line or as a second line break.

The problem can be hard to spot if some programs handle the foreign newlines properly while others don't. For example, a compiler may fail with obscure syntax errors even though the source file looks correct when displayed on the console or in an editor. On a Unix system, the command cat -v myfile.txt will send the file to stdout (normally the terminal) and make the ^M visible, which can be useful for debugging. Modern text editors generally recognize all flavours of CR / LF newlines and allow the user to convert between the different standards. Web browsers are usually also capable of displaying text files of different types.
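The conversion between newline conventions that editors offer can be sketched in a few lines. This is only an illustration: normalize_newlines is a hypothetical helper, not a library function, and it assumes an ASCII platform where '\r' and '\n' are CR and LF and that the text was read without any text-mode translation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite CR+LF and lone CR line endings as LF.
// Assumes '\r' == CR (0x0D) and '\n' == LF (0x0A), as on ASCII platforms.
std::string normalize_newlines(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            // A CR+LF pair counts as one line break, so skip the LF.
            if (i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;
            }
            out += '\n';
        } else {
            out += text[i];
        }
    }
    return out;
}
```

Calling normalize_newlines("a\r\nb\rc\n") gives "a\nb\nc\n", so a file that mixes all three conventions ends up with LF only.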

1.3 Open modes in C++

You can control the way a file is opened by overriding the constructor’s default arguments. The following table shows the flags that control the mode of the file:

ios::in
Opens an input file. Use this as an open mode for an ofstream to prevent truncating an existing file.

ios::out
Opens an output file. When used for an ofstream without ios::app, ios::ate or ios::in, ios::trunc is implied.

ios::binary
Opens a file in binary mode. The default is text mode.

You can combine these flags using a bitwise OR operation. The binary flag, while portable, only has an effect on some non-UNIX systems, such as operating systems derived from MS-DOS, that have special conventions for storing end-of-line delimiters. For example, on MS-DOS systems in text mode (which is the default), every time you output a newline character ('\n'), the I/O library actually outputs two characters, a carriage-return/linefeed pair (CRLF), which is the pair of ASCII characters 0x0D and 0x0A. Conversely, when you read such a file back into memory in text mode, each occurrence of this pair of bytes causes a '\n' to be sent to the program in its place. If you want to bypass this special processing, you open files in binary mode.

Binary mode has nothing whatsoever to do with whether you can write raw bytes to a file; you always can (by calling write( )). You should, however, open a file in binary mode when you’ll be using read( ) or write( ), because these functions take a byte count parameter. Having the extra '\r' characters will throw your byte count off in those instances. You should also open a file in binary mode if you’re going to use stream-positioning commands such as seekg( ) and tellg( ).
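A minimal sketch of the byte-count point, assuming nothing beyond the standard library. The helper names write_raw, file_size and read_raw are made up for this example. Because the file is opened with ios::binary, the number of bytes on disk matches the count passed to write( ) on every platform, including those that translate '\n' in text mode.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: write the bytes unchanged using write() in binary mode.
bool write_raw(const std::string& path, const std::string& bytes) {
    std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return out.good();
}

// Hypothetical helper: size of the file in bytes, or -1 if it cannot be opened.
long file_size(const std::string& path) {
    // ios::ate positions the stream at the end, so tellg() reports the size.
    std::ifstream in(path.c_str(),
                     std::ios::in | std::ios::binary | std::ios::ate);
    if (!in) {
        return -1;
    }
    std::streamoff size = in.tellg();
    return static_cast<long>(size);
}

// Hypothetical helper: read up to `count` bytes back using read().
std::string read_raw(const std::string& path, std::size_t count) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    std::string buffer(count, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(count));
    // gcount() is the number of bytes the last read() actually delivered.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

Writing the 8-character string "one\ntwo\n" with write_raw produces an 8-byte file. Dropping ios::binary on an MS-DOS-derived system would be expected to produce 10 bytes instead, since each '\n' becomes a CRLF pair.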

1.4 Conclusion

The representation of text files varies among operating systems. For example, the end of a line in a UNIX environment is represented by the linefeed character '\n'. On some other systems, such as Microsoft Windows, the end of the line consists of two characters, carriage return '\r' and linefeed '\n'. The end of the file differs as well on these two operating systems. Peculiarities on other operating systems are also conceivable.

To make programs more portable among operating systems, an automatic conversion can be done on input and output. The carriage return / linefeed sequence, for example, can be converted to a single '\n' character on input; the '\n' can be expanded to "\r\n" on output. This conversion mode is called text mode, as opposed to binary mode. In binary mode, no such conversions are performed.

The mode flag std::ios_base::binary has the effect of opening a file in binary mode. This has the effect described above; in other words, all automatic conversions, such as converting "\r\n" to '\n', are suppressed. [Basically, the binary mode flag is passed on to the respective operating system's service function, which means that in principle all system-specific conversions are suppressed, not only the carriage return / linefeed handling.]

If you must process a binary file, you should always set the binary mode flag, because most likely you do not want any kind of implicit, system-specific conversion performed.

The effect of the binary open mode is frequently misunderstood. It does not put the inserters and extractors into a binary mode, so it does not suppress the formatting they usually perform. Binary input and output is done solely by basic_istream<>::read() and basic_ostream<>::write().
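The distinction can be checked with a small experiment. The sketch below uses hypothetical helper names (slurp, formatted_bytes, raw_bytes) and assumes the default "C" locale. Both files are opened in binary mode, yet the inserter still writes the decimal text of the number, while write() stores the object's bytes.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read a whole file back as raw bytes.
std::string slurp(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Write `value` with the inserter (operator<<) to a binary-mode file
// and return what ended up in the file.
std::string formatted_bytes(int value, const std::string& path) {
    {
        std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
        out << value;  // still formatted as text, despite ios::binary
    }  // the stream is flushed and closed here
    return slurp(path);
}

// Write the in-memory representation of `value` with write() and
// return what ended up in the file.
std::string raw_bytes(int value, const std::string& path) {
    {
        std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
        out.write(reinterpret_cast<const char*>(&value), sizeof value);
    }
    return slurp(path);
}
```

For the value 42, formatted_bytes yields the two characters "42", whereas raw_bytes yields sizeof(int) bytes whose order depends on the machine's endianness.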

Monday, February 23, 2009

Convert an image sequence from one format to another

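% Convert frames 1..942 from colour JPEG to downscaled grayscale PGM.
% The output directory .../logitech/gray/ must already exist.
for i = 1:942

    filename_in = sprintf('/home/gaipaul/Desktop/data/logitech/%08d.jpg',i);
    I = imread(filename_in);
    Ig = rgb2gray(I);           % colour to grayscale
    Ig = imresize(Ig,0.3);      % shrink to 30% of the original size
    filename_out = ['/home/gaipaul/Desktop/data/logitech/gray/' int2str(i) '.pgm'];
    imwrite(Ig,filename_out);   % output format is inferred from the .pgm extension

end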

Smooth scrolling - Emacs hackery

Emacs mouse wheel scrolling can be abrupt or "jumpy" and cause one to lose their tracking. This is a hack that rebinds the mouse wheels to some functions that scroll one line at a time, pausing for a slight delay between each scroll. The times are tweakable to achieve different effects (like a scroll that slows down as it nears the end of its duration):

(defun smooth-scroll (increment)
  "Scroll by INCREMENT lines six times, pausing briefly between scrolls."
  (scroll-up increment) (sit-for 0.05)
  (scroll-up increment) (sit-for 0.02)
  (scroll-up increment) (sit-for 0.02)
  (scroll-up increment) (sit-for 0.05)
  (scroll-up increment) (sit-for 0.06)
  (scroll-up increment))

(global-set-key [(mouse-5)] '(lambda () (interactive) (smooth-scroll 1)))
(global-set-key [(mouse-4)] '(lambda () (interactive) (smooth-scroll -1)))

A more generic function is as follows, though it cannot pause for variable lengths of time. You could use this if you want to more easily change the number of lines scrolled:

(defun smooth-scroll (number-lines increment)
  "Scroll by INCREMENT lines NUMBER-LINES times, pausing 0.02 s before each scroll."
  (if (= 0 number-lines)
      t
    (progn
      (sit-for 0.02)
      (scroll-up increment)
      (smooth-scroll (- number-lines 1) increment))))

(global-set-key [(mouse-5)] '(lambda () (interactive) (smooth-scroll 6 1)))
(global-set-key [(mouse-4)] '(lambda () (interactive) (smooth-scroll 6 -1)))

I have only tested these on GNU Emacs (version 23.0.60). If they do not work on XEmacs, I would appreciate any tips you could send me. Write me at "dzwell at [this domain]".

Wednesday, February 11, 2009

Making movies from image files using ffmpeg/mencoder

+ convert all the images to JPEGs
for f in *ppm ; do convert -quality 100 "$f" "$(basename "$f" ppm)jpg"; done

+ encode the image files into a movie using either mencoder or ffmpeg
 mencoder "mf://*.jpg" -mf fps=10 -o test.avi -ovc lavc \
          -lavcopts vcodec=msmpeg4v2:vbitrate=800
 ffmpeg -r 10 -b 1800 -i %03d.jpg test1800.mp4

Thursday, February 5, 2009

Using gdb/ddd to debug child processes

If you have tried to debug a child process using ddd, you may have noticed that ddd steps into the parent (and not the child) after the call to fork(). It is possible to debug the child as well, but it requires a special procedure. Since the child is a separate process, it will require a second debugger window, and we will make use of gdb's ability to "attach" to a process which is already running.

Before you start, you must do the following:

1. Make sure your call to fork() assigns a value to some variable so you can read it easily, e.g. "pid = fork()".
2. Make sure you place a sleep() statement in the child as the first line of code after the fork(), e.g. "sleep(60)" [make the sleep() long enough for step 4 below ...]. The sleep() statement can be removed once debugging is complete.
3. Compile your program with the "-g" option set, e.g. gcc myProg.c -o myProg -g
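The preparation above might look like the following sketch. It is written in C++ (so it would be compiled with g++ -g rather than gcc -g), but it uses the same POSIX calls a C program would. The function name run_child and its parameters are invented for illustration; in a real program the child would do its actual work where the comment indicates.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical example: fork a child that pauses before doing its work,
// leaving time to attach a debugger to it. Returns the child's exit
// status as seen by the parent, or -1 on failure.
int run_child(unsigned int attach_delay_seconds, int child_status) {
    pid_t pid = fork();  // keep the result in a variable so it is easy to read
    if (pid < 0) {
        return -1;       // fork failed
    }
    if (pid == 0) {
        // Child: first statement after fork() is the sleep, e.g. sleep(60).
        // Remove it once debugging is complete.
        sleep(attach_delay_seconds);
        // ... the child code to be debugged would go here ...
        _exit(child_status);
    }
    // Parent: pid now holds the child's process ID, the value to attach to.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A breakpoint placed on the waitpid() line is in parent-only code, and at that point the debugger can print pid to get the child's process ID.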

Now you are ready to start:

1. Start 2 copies of ddd in the background, like "ddd myProg & ddd myProg &". It is important that the two copies be running concurrently.
2. Pick (arbitrarily) one window to be the "parent" and set a breakpoint after the call to fork(), somewhere in code that only the parent executes. If you set the breakpoint in the child's code, DDD will kill the child as soon as it is created!
3. Run the parent to the breakpoint. Note the value returned by fork(), i.e. the process ID of the child.
4. In the "child" window, type "attach <pid>" in the gdb console window, where <pid> is the child's process ID. Note: the gdb console is found at the bottom of the ddd window; this is where you can type commands directly to gdb.
5. Set a breakpoint in the child after the sleep() statement, and click on "cont" (in the popup "Command Tool" window) to allow the child to continue execution to the breakpoint.

Tuesday, February 3, 2009

Shared libraries and static libraries

Although the example program above has been successfully compiled and linked, a final step is needed before being able to load and run the executable file.

If an attempt is made to start the executable directly, the following error will occur on most systems:

$ ./a.out
./a.out: error while loading shared libraries:
libgdbm.so.3: cannot open shared object file:
No such file or directory

This is because the GDBM package provides a shared library. This type of library requires special treatment--it must be loaded from disk before the executable will run.

External libraries are usually provided in two forms: static libraries and shared libraries. Static libraries are the ‘.a’ files seen earlier. When a program is linked against a static library, the machine code from the object files for any external functions used by the program is copied from the library into the final executable.

Shared libraries are handled with a more advanced form of linking, which makes the executable file smaller. They use the extension ‘.so’, which stands for shared object.

An executable file linked against a shared library contains only a small table of the functions it requires, instead of the complete machine code from the object files for the external functions. Before the executable file starts running, the machine code for the external functions is copied into memory from the shared library file on disk by the operating system--a process referred to as dynamic linking.

Dynamic linking makes executable files smaller and saves disk space, because one copy of a library can be shared between multiple programs. Most operating systems also provide a virtual memory mechanism which allows one copy of a shared library in physical memory to be used by all running programs, saving memory as well as disk space.

Furthermore, shared libraries make it possible to update a library without recompiling the programs which use it (provided the interface to the library does not change).

Because of these advantages gcc compiles programs to use shared libraries by default on most systems, if they are available. Whenever a static library ‘libNAME.a’ would be used for linking with the option -lNAME the compiler first checks for an alternative shared library with the same name and a ‘.so’ extension.

In this case, when the compiler searches for the ‘libgdbm’ library in the link path, it finds the following two files in the directory ‘/opt/gdbm-1.8.3/lib’:

$ cd /opt/gdbm-1.8.3/lib
$ ls libgdbm.*
libgdbm.a  libgdbm.so

Consequently, the ‘libgdbm.so’ shared object file is used in preference to the ‘libgdbm.a’ static library.

However, when the executable file is started its loader function must find the shared library in order to load it into memory. By default the loader searches for shared libraries only in a predefined set of system directories, such as ‘/usr/local/lib’ and ‘/usr/lib’. If the library is not located in one of these directories it must be added to the load path.

The simplest way to set the load path is through the environment variable LD_LIBRARY_PATH. For example, the following commands set the load path to ‘/opt/gdbm-1.8.3/lib’ so that ‘libgdbm.so’ can be found:

$ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib
$ export LD_LIBRARY_PATH
$ ./a.out
Storing key-value pair... done.

The executable now runs successfully, prints its message and creates a DBM file called ‘test’ containing the key-value pair ‘testkey’ and ‘testvalue’.

To save typing, the LD_LIBRARY_PATH environment variable can be set automatically for each session using the appropriate login file, such as ‘.bash_profile’ for the GNU Bash shell.

Several shared library directories can be placed in the load path, as a colon separated list DIR1:DIR2:DIR3:...:DIRN. For example, the following command sets the load path to use the ‘lib’ directories under ‘/opt/gdbm-1.8.3’ and ‘/opt/gtk-1.4’:

$ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib
$ export LD_LIBRARY_PATH

If the load path contains existing entries, it can be extended using the syntax LD_LIBRARY_PATH=NEWDIRS:$LD_LIBRARY_PATH. For example, the following command adds the directory ‘/opt/gsl-1.5/lib’ to the load path shown above:

$ LD_LIBRARY_PATH=/opt/gsl-1.5/lib:$LD_LIBRARY_PATH
$ echo $LD_LIBRARY_PATH
/opt/gsl-1.5/lib:/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib
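To make the colon-separated format concrete, here is a rough sketch of how such a list breaks down into directories. split_load_path is a hypothetical helper, not part of the loader or any library. It only models the splitting, and it skips empty entries rather than giving them any special meaning. The real loader also consults other locations besides this variable.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split a colon-separated load path into its
// directories, preserving order. Empty entries are skipped.
std::vector<std::string> split_load_path(const std::string& load_path) {
    std::vector<std::string> dirs;
    std::string::size_type start = 0;
    while (start <= load_path.size()) {
        std::string::size_type end = load_path.find(':', start);
        if (end == std::string::npos) {
            end = load_path.size();  // last entry runs to the end of the string
        }
        if (end > start) {
            dirs.push_back(load_path.substr(start, end - start));
        }
        start = end + 1;
    }
    return dirs;
}
```

Applied to the value printed above, the result is three directories with ‘/opt/gsl-1.5/lib’ first, which reflects that directories prepended with NEWDIRS:$LD_LIBRARY_PATH come ahead of the existing entries.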

It is possible for the system administrator to set the LD_LIBRARY_PATH variable for all users, by adding it to a default login script, such as ‘/etc/profile’. On GNU systems, a system-wide path can also be defined in the loader configuration file ‘/etc/ld.so.conf’.

Alternatively, static linking can be forced with the -static option to gcc to avoid the use of shared libraries:

$ gcc -Wall -static -I/opt/gdbm-1.8.3/include/ \
   -L/opt/gdbm-1.8.3/lib/ dbmain.c -lgdbm

This creates an executable linked with the static library ‘libgdbm.a’ which can be run without setting the environment variable LD_LIBRARY_PATH or putting shared libraries in the default directories:

$ ./a.out
Storing key-value pair... done.

As noted earlier, it is also possible to link directly with individual library files by specifying the full path to the library on the command line. For example, the following command will link directly with the static library ‘libgdbm.a’,

$ gcc -Wall -I/opt/gdbm-1.8.3/include \
   dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.a

and the command below will link with the shared library file ‘libgdbm.so’:

$ gcc -Wall -I/opt/gdbm-1.8.3/include \
   dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.so

In the latter case it is still necessary to set the library load path when running the executable.