Thoughts & Musings Posts

Life is like Pitching for VC Funding

Sunset in Arizona, on our way back from Chiricahua National Monument. Copyright: Asif Rehan

I was a hardworking student from early on, so my parents kept investing in my education. When I was in grade 8, I convinced them to let me attend Dhaka Residential Model College for grades 9-10, because I expected my father’s job to transfer him away from the capital city, and the family would have to move with him. I anticipated that changing schools at that point would hurt my Secondary School Certificate exam, the first crucial exam in a student’s academic career in Bangladesh. The school was very expensive, and the admission test was tough. Still, I got the admission form and prepared for the test entirely on my own. I was scared that I would not be able to compete against better-prepared, tutored candidates for one of the limited seats; I did not even have a tutor outside of school in grade 8. But I passed the test, and I learned a very important lesson: limited resources do not always mean limited capabilities. After the admission, my parents paid the hefty admission fees. I still remember it was around BDT 10390, which is roughly USD 125. For comparison, my public school in grade 8 cost us only around BDT 25 in 2002. Another lesson learned: if you can show credibility and competence, the people whose trust you have earned will invest in you.

Fast forward: I got through undergrad at reputedly the most prestigious university in Bangladesh with minimal expense. Because I had proved my potential in the highly selective admission test, I attended almost for free. The same lesson applies: show the potential, and people will invest in the future.

Up next, my professor at UConn invested in my MS studies because he believed I would be able to produce research outcomes for his project. So I had the amazing opportunity to work as a Graduate Research Assistant: my tuition was covered, and I also received a monthly stipend. Studying in the USA would otherwise have been quite expensive. Same lesson: if you can convince people that the future is amazing, you can marshal resources.

Right now, my employer, Ford Motor Company, is paying the tuition for my second MS. My former boss believed I could combine domain knowledge in Mobility, Data Science and Computer Science to build amazing analytic software in the future. Again, I do not have to pay the tuition fees. Convincing matters.

So, what about work and career? This is the phase I am in right now. I am building my technical skills and developing people skills; this part is about building competence and credibility. Next, I will need to develop proposals for projects that have the potential to create an outsize impact in the future. That will take insight, foresight, and research. At that point, I will have to convince the company to provide resources, including a team and a budget, and you know the formula by now: pitch for the future with competence and credibility, and resources will never be a problem.

So this is my formula for becoming a change maker, leader, entrepreneur/intrapreneur, whatever you name it. It is the only formula I have learned from my life. Even in my personal life, my beautiful and extremely supportive wife somehow had to be convinced that our future together would be significantly better than our futures apart. So that was also a kind of soft pitch.

I don’t know yet if my idea is future-proof, but it worked out pretty well so far! If we think like this, we have all been running our own startups: ourselves, and all of us have been pitching for VC funding throughout our lives!

Distributed/Big Data Geospatial Processing Tools

Work in progress. I will write about each approach in more detail later.

This is just a summary of the tools for connecting to Hadoop and running geospatial processing on a large dataset. I am working on a ~100 GB Hive table, which is only a small subset of the original dataset.

  1. http://geospark.datasyslab.org/
  2. https://pypi.org/project/geopyspark/
  3. https://github.com/Esri/gis-tools-for-hadoop/wiki
  4. Kinetica GPU Database – Graph solver and Match solver
  5. PySpark python libraries
  6. Spatial Hadoop
  7. Alteryx – Using Connect-in-DB function to connect to Hadoop

My Application to OMSCS

I applied to Georgia Tech’s OMSCS program in 2018 and started my first semester in January 2019. Now that I have been in the program for over a year, I have sometimes felt that I should share my admission essays. After this Spring 2020 semester, I will be halfway to graduation in terms of credits. So I want to share the essays at this point of my new academic journey, for my own reflection and, hopefully, for some accountability to keep pushing towards the goals I have set for myself. It has taken great effort to balance my job as a Data Scientist at Ford Motor Company with my studies in parallel. Thanks to my loving wife, Mou, for all the support!

Now without any further ado, you can read the two essays right below. These were written in 2018. So the contents here reflect that time.

Career Objectives and Background Essay – “What has prepared you for this program?”

Please describe your background (academic and extracurricular) and experience, including research, teaching, industry, and other relevant information. 

Coming from a Transportation Engineering background, I learned C++ in college. But only much later as a Graduate Research Assistant at UConn, I applied programming skills in Python to build a prototype to estimate route level travel time from GPS data. For this, I implemented a Hidden Markov Model-based map-matching algorithm using Kernel Ridge Regression and Viterbi algorithm. Taking Applied Statistics courses and achieving 4.3 out of 4 in the Neural Networks course at UConn prepared me to handle challenging mobility problems. I also taught myself Java, implemented data structures and algorithms in Java, and studied Complexity Theory using Princeton, MIT, and University of Arizona course lectures available online.

While working at Metropia as a Research Engineer, I had a deep exposure to software and web architecture. There I actually covered many roles: researched new algorithms, developed software, discussed backend architecture, set up and maintained AWS servers and databases, and created dashboards using SQL. I researched and developed an algorithm to predict delay at traffic intersections and integrated with the real-time routing API. Eventually, I also developed a parameter optimization tool in Python that worked across a few AWS servers simultaneously improving routing prediction by a few percentage points. I developed an API predicting El Paso-Juarez border crossing time using GPS data from our app users. It involved building web-crawlers, statistical modeling, Neural Net regression, and a real-time reporting dashboard.

Since I have joined Ford as a Data Scientist, I am leading end-to-end development and commercialization of a new analytic product. I developed the algorithms, and the data pipeline by exploiting QuadTree for geospatial sampling and multithreaded API calls to collect 2 million data for the prediction model. Currently, I am working with the team to scale it globally.

With my experience in applying Computer Science, I am well prepared for this program.

Statement of purpose

Please give a Statement of Purpose detailing your academic and research goals as well as career plans. Include your reasons for choosing the College of Computing as opposed to other programs and/or other universities. 

In summer 2014, while I was in my previous Master’s program in Civil Engineering at UConn, I was mostly coding in Python for my graduate research project and was frequently reading papers from Computer Science (CS) journals. Later, during the summer of 2016, I was working at a startup, Metropia. By then, I was able to build many algorithms and software tools using my programming skills, but I realized that I lacked many fundamental CS concepts, which I have been teaching myself using online resources. This is neither efficient, nor does it lead to any university endorsement of my skills. So my conviction to obtain a formal master’s degree in CS has been growing stronger ever since.

On the other hand, it is not realistic to leave my full-time job for graduate school either. I heard about the OMSCS program at the beginning of 2018. Right at that time, I joined Ford Motor Company as a Data Scientist. At Ford, I realize that the tools I build can be better if I have a deeper understanding of CS concepts. Fortunately, my manager agreed and encouraged me to pursue OMSCS, and Ford would support me with tuition fees. Three current colleagues at Ford who are enrolled in the OMSCS program gave it very positive reviews and mentioned that it is rigorous. Given the highly respected, top-ranked graduate CS program at Georgia Tech and its pioneering faculty, I intend to work hard, learn the most, and apply it to my job and career.

From my background in Mobility and Transportation, I find mobility problems heavily rely on CS theory and techniques. For example, in my work, I used parallel programming, a code profiler to optimize memory at runtime, and also developed APIs. But I want to go deeper and learn better ways to do it. I would like to take the opportunity of curated courses at OMSCS.  I have come quite far by teaching myself CS skills, and I would like to get the missing CS degree to accelerate forward.

My career plan for the next 8-10 years is divided into three phases. In the short term, while I am enrolled in the degree, I would like to leverage my coursework to develop better analytic software products on my job. My focus will be to learn as much as possible about software engineering and build scalable machine learning tools. Courses like Graduate Algorithms and Machine Learning will be useful for this purpose. Also, the system design knowledge from courses like Software Development Process and Software Architecture and Design will be useful as well. I intend to choose machine learning specialization for my degree. I believe my current and immediate career trajectory will have many projects to apply all the learnings from the degree. 

For the next 1-3 years after graduation, I would like to be able to lead analytic software development projects with my technical leadership. With a degree from Georgia Tech, I envision that my skills will be valued, and I will be leading a technical team. 

Finally, 3-5 years after my graduation, I would like to be able to architect full-stack analytic software. This step will need more time for me to get enough experience. At this phase, I believe I will have enough exposure to the industry to find new product ideas and be a product manager. I intend to obtain an MBA to develop skills for running a company. I imagine I will be either leading a team within a large company or establishing my own that will develop software for emerging autonomous vehicles. 

With my current role at Ford, I see the emerging new autonomous technologies closely and my transportation engineering background will be immensely valuable in creating new software products for the new upcoming transportation ecosystem. In the future, I may stay at my current company at Ford or move to other companies like General Motors, Google, Uber or Lyft, or maybe start my own. But with the OMSCS degree from Georgia Tech, I will be better able to progress in my career trajectory as I have outlined above.

How I added SSL Certificate to my personal blog site on NameCheap.com

This blog site is currently hosted on EasyWP, the managed WordPress hosting available on NameCheap.com. Though many hosting services integrate with the free Let’s Encrypt SSL tool, NameCheap does not seem to have any plan to do that. Instead, NameCheap asks you to pay $8.88 per year for the basic SSL certificate that puts the lock sign in the browser’s address bar when your website is accessed. Google and other search engines discourage visitors from visiting sites that do not have SSL certificates, for obvious security reasons. But there are free options for getting a basic SSL certificate for your website. Many online articles describe the steps to get the certificate and install it through the CPanel.

However, as I found out through a chat with a NameCheap customer service representative, users who only use EasyWP on NameCheap do not get access to CPanel. So adding an SSL certificate manually through CPanel does not work. Instead, what worked for me is SSL for Free, which lets you generate a free SSL certificate, a CA certificate and a private key when you select manual verification. Just follow the steps on SSL for Free. Since that website already has a nice post to help you through the simple process, I am just sharing the URL here.

Hope this post helps you find your way to get that padlock icon next to your own blog site as well!

How much memory does importing PANDAS library take?

Objective

Let us measure how much memory importing the Python PANDAS library consumes.

Methodology

Small script called memtest.py:


@profile
def a():
    import pandas

if __name__=="__main__":
    a()

Test it with

$ python -m memory_profiler memtest.py

Results:

Output:

The memory increment reported for line #4 shows that importing pandas took 36.527 MB on my machine. What does your benchmarking result look like?
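If memory_profiler is not installed, a rough cross-check is possible with the standard library alone. The sketch below is an assumption-laden alternative, not the method used above: `import_cost_mib` is a hypothetical helper name, and `tracemalloc` only counts memory allocated through Python's own allocator, so its numbers will generally differ from the process-level increment that memory_profiler reports.

```python
import importlib
import tracemalloc


def import_cost_mib(module_name):
    """Return (retained, peak) Python-level allocations in MiB made while importing a module.

    Hypothetical helper for illustration. A module that is already imported
    is served from the import cache and will show a cost near zero.
    """
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        importlib.import_module(module_name)
        after, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    mib = 1024 * 1024
    return (after - before) / mib, (peak - before) / mib
```

Calling `import_cost_mib("pandas")` in a fresh interpreter gives a figure to set beside the memory_profiler result; run it before anything else has imported pandas.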

Merge a repo with another as a subfolder

Sometimes we end up with one main repository and a second, independently developed repository for a new feature. Later it may turn out that the independent repo needs to become part of the main repo as a subfolder.

To do that, we can use git subtree. From the root of the main repo, run the command below with the new subfolder name, the URL of the independent repo, and the branch to pull in, and voila! (GitHub no longer serves the unauthenticated git:// protocol, so the URL uses https://.)

git subtree add --prefix=new_subdirectory_name https://github.com/userid/feature_repo.git project_branch_name

Best way to setup PYTHONPATH for crontab

There are many suggestions on this.

Add PYTHONPATH at the end of the ~/.bash_profile or ~/.bash_login files. If they do not exist, add it to ~/.bash_profile as suggested by this StackOverflow post.

export PYTHONPATH="${PYTHONPATH}:/home/path/to/your/python/package/"

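But this only adds the package to the current user’s PYTHONPATH in login shells, and cron does not read those files. To make sure crontab picks it up, add this line at the top of the crontab file, BEFORE any entry that runs a script from that package. Note that crontab does not expand variables such as ${PYTHONPATH}, so the value is given as a plain path: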

PYTHONPATH=/home/path/to/your/python/package
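To confirm that the setting actually reaches a cron job, the script itself can check its import path at startup. The following is a minimal sketch; `is_on_sys_path` is a hypothetical helper, and the directory in the usage note is the same placeholder path as above.

```python
import os
import sys


def is_on_sys_path(directory, search_path=None):
    """Return True if `directory` is one of the import search paths.

    Paths are normalized first, so a trailing slash or a relative
    spelling of the same directory still matches.
    """
    paths = sys.path if search_path is None else search_path
    target = os.path.normcase(os.path.abspath(directory))
    return any(
        os.path.normcase(os.path.abspath(p)) == target
        for p in paths
        if isinstance(p, str)
    )
```

For example, a cron-run script could print `is_on_sys_path("/home/path/to/your/python/package")` and send that output to a file; `False` would suggest the PYTHONPATH line was not picked up.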

Best way to setup PYTHONPATH for crontab

When setting up a crontab job on a Linux machine, these are the essential steps for getting it to run successfully:

  1. Update the cron file by adding the new script on schedule
  2. Check the frequency of the schedule. For example, to run every 7 minutes (at minutes 0, 7, 14, ..., 56 of each hour, so the last gap before the next hour is only 4 minutes), use
    */7 * * * *   python /path/to/script.py

    Or, to run once an hour at the 7th minute, use

    7  *  *  *  *  python /path/to/script.py
  3. Check the file permission for the script. If it is not executable, then make it executable
    ls -l /path/to/script.py

    Then if the file is not executable by the user add the permission by

    chmod 755 /path/to/script.py

    (you may need to have sudo access if you are not the owner of the file. For changing ownership of a file or folder, use chown command on Linux variants)

  4. Check the file permissions for the output location if the script produces files. If the containing folder or the output file is not writable, make it writable, following the same process as in step #3.
  5. You may redirect any print statements (and error messages) to a file, such as
    7 * * * * python /path/to/script.py > /path/to/output/file.txt 2>&1
  6. Or better yet, use Python logging library to create useful log files and show if the program ran correctly.
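
The logging suggestion in the last step can be sketched roughly as follows. This is only an illustration: the logger name `cron_job`, the helper names, and the `work` callable are placeholders standing in for whatever the real script does.

```python
import logging


def configure_logging(log_path):
    """Create a logger that appends timestamped records to log_path."""
    logger = logging.getLogger("cron_job")  # placeholder logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def run_job(logger, work):
    """Run `work()` and record start, success, or failure in the log."""
    logger.info("Job started")
    try:
        result = work()
    except Exception:
        # logger.exception also writes the traceback to the log file
        logger.exception("Job failed")
        raise
    logger.info("Job finished, result=%s", result)
    return result
```

A script run by cron could call `run_job(configure_logging("/path/to/output/job.log"), main)`, where `main` is its own entry point. The log then shows whether each scheduled run started and finished, which redirected print output alone may not make obvious.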