Analytics

"The Skill that Industry Hires Need"

According to this post in Science magazine (geared toward PhD students and academics in science), industry employers particularly value project management skills in new hires, "including working in a team and delivering on schedule and on budget". I found this particularly striking because project management is perhaps the least used skill in graduate school. There is no fixed timeline for submitting a research paper or getting a degree. Some doctoral students take five years to graduate; others take eight. There is little concept of a schedule to be kept, research-wise. In a way, not only are graduate students not taught project management, they are taught the opposite: it takes the time that it takes; what matters is the end result. No wonder then that graduate students with industry internships have an advantage over the competition when they seek industry positions.

(As a side note, this got me thinking about the skill that analytics students need the most, since I teach analytics rather than science. Of course project management is important for analytics students too, but based on my experience, students struggle the most with the idea that there might not be a single best model to be created from their data. You can create, say, a linear regression keeping only the coefficients that are significant at the 95% confidence level, or you can focus on the 99% level, or you can have categorical variables, with some labels being very significant and others not as much, or you can have a logistic regression model and a classification tree model to predict a binary outcome, and each has its advantages and disadvantages. Students are sometimes disappointed when it feels like there isn't a unique best answer. But you create your models based on the data you have, and the process is necessarily imperfect. You can still get good insights from your model.)

Going back to the theme of this post, I think that undergraduate students learn more about project management through their capstone project at the end of their studies than doctoral students do. It makes sense, given that most undergrads go on to industry positions right after graduation (only a few get a Master's degree before starting work), but it is time to recognize the changed job prospects for PhDs too. Could we bring project management to academic research itself? Grant proposals ask us principal investigators to do as much, with budget justifications, deliverables and intermediate milestones, but graduate students are rarely involved in defining those. Maybe universities should provide more training on those matters.

Or maybe this could motivate a stronger emphasis on doctorate programs with time-constrained "praxis" capstone projects rather than dissertations, such as the D.Eng. rather than the PhD. Perhaps it is even time for a renaissance of doctoral degrees that aren't PhDs in order to better meet industry needs, or the creation of an intermediary degree between the Master's and the PhD. When I was at MIT, my department (Electrical Engineering and Computer Science) had a degree of Electrical Engineer, which was aimed at doctoral students who had completed all coursework in the PhD program: the All But Dissertation folks. Obviously most A.B.D.s don't plan on working in academia and maybe an advanced degree geared toward industry would be better suited to their career goals. This raises the issue of degree visibility and name recognition if only a handful of universities deliver the new degree, but given today's pace of change, it'd make sense to introduce new degrees more suited to the needs of the workforce.

We could even imagine a system where students get credentials for each year of graduate study (or some number of credits to account for part-time students), with "Graduate Credential Level 1" being received at the end of the first year (maybe similarly to a Master of Engineering), "Level 2" at the end of the second year (equivalent to a Master of Science), and then adding "Level 3", "Level 4", etc., with students being able to stop for a few years in between if they so wish. There is a lot of talk on campuses these days about continuing education, but it is unrealistic to expect these trends will fit neatly within existing degree programs. It is time for new graduate degrees.


INFORMS Wagner Prize: Call for Entries

Each year INFORMS grants several prestigious institute-wide prizes and awards for meritorious achievement. The Daniel H. Wagner Prize for Excellence in Operations Research Practice emphasizes the quality and coherence of the analysis used in practice. Dr. Wagner strove for strong mathematics applied to practical problems, supported by clear and intelligible writing. This prize recognizes those principles by emphasizing good writing, strong analytical content, and verifiable practice successes.

Applicants to this prestigious award have come from a variety of areas, such as: Health Care, Logistics, Supply Chain, Political Districting, Manufacturing, Cancer Therapeutics, Machine Learning, etc.

The deadline for submitting a 2-page abstract for the Wagner Prize is: May 1, 2017.

More information here.


From the Economist

Some interesting articles from a recent issue of The Economist:

  • Superstition ain't the way, about China's data: the apparent unreliability of certain figures and the search for more reliable indicators. Some quotes: "Provincial GDP figures do not add up to the national total. Quarterly and annual growth do not always mesh... Growth has not dipped below 6.7%, even as prices slipped into deflation in late 2015... As far back as 2000, scholars turned to indicators like electricity consumption as a statistical refuge from what one called the "wind of falsification and embellishment" rustling the official data. But electricity is a less reliable guide as an economy evolves away from power-hungry industry toward low-wattage services." The article discusses combining rail freight, electricity and bank lending into a single gauge of national economic activity, called the "Li Keqiang index" in honor of China's premier, who suggested it. The Federal Reserve Bank of San Francisco has improved the index by fitting it to GDP figures from 2000-2009 but, while its predictive power remained strong until 2012, it decreased afterward, perhaps because of the boom in financial services that recently boosted China's GDP. The article also describes other attempts at constructing more insightful indices with more data. Goldman Sachs has even combined 89 products. One remaining issue is how to convert those output figures into monetary data. Official GDP figures, though, may be improving thanks to technology advances.
  • Called to account: The disturbing prosecution of Greece's chief statistician. This is a must-read, about the high price Greece's chief statistician, Andreas Georgiou, who also has 21 years of experience at the IMF, is paying for trying to keep the numbers honest. His crime has been to estimate that the government's budget deficit in 2009 was 15.4% of GDP, a figure that the European Commission has confirmed as accurate. His detractors blame him for the panic that led to Greece's bail-out in 2010 and for the harsh conditions imposed by Greece's creditors. The Economist writes: "Courts have rejected these charges three times. But on August 1st the Greek supreme court reopened the case." Georgiou is also facing separate charges for "refusing to allow ELSTAT's board to use a vote to decide on the level of the deficit. Statistics are not supposed to work by ballot."
  • Fashion forward, about Zalando, which sells shoes and clothes online in Europe. (This business model is old news in the United States but not in Europe.) Part of its success can be attributed to the close attention it pays to data. It also targets higher-value, brand-conscious shoppers, while Amazon targets more price-conscious customers, so it hopes to retain the lead in the market even as Amazon and Alibaba build their offerings in Europe.
  • Leaving for the city: a Schumpeter column about how lots of prominent American companies are moving downtown. The main example is that of General Electric, which is moving from Fairfield, CT to the Boston waterfront in the new innovation district and apparently won't have a car park, to encourage people to take public transportation. Since I saw the innovation district at length when I was back at MIT for my sabbatical two years ago, I can only say this is the sort of idea that looks good on paper and will distress employees when they see what the Silver Line is like. On the other hand, this might have the unexpected side effect of encouraging employees to get home in time for dinner and spend time with their families, since they're not going to want to take the Silver Line late. Taking public transportation to work sounds appealing when you're in your 20s and live in an apartment in the city a few bus stops away from your workplace. If you want your kids to go to a top school district like Brookline or Newton, the commute is going to seem less appealing. I mention this about Boston but it is true of other cities as well. My guess is that at some point the tax incentives are going to lose their luster. The Schumpeter columnist recommends reading "The Big Sort" by Bill Bishop and explains: "It argues that Americans are increasingly clustering in distinct areas on the basis of their jobs and social values. The headquarters revolution is yet another iteration of the sorting process that the book describes, as companies allocate elite jobs to the cities and routine jobs to the provinces." But Fairfield, CT was never a backwater. Additionally, the part of the column that distinguishes between mass headquarters in sunbelt cities and executive headquarters in elite cities was quite interesting. For a good view of the future, read the part of the column about San Francisco.

Best presentation at the #Analytics16 conference...

...(at least among the ones I've seen) was by Dr. Tim Niznik of American Airlines who gave an outstanding talk on hub disruption management. He had the last time slot on Tuesday, which is always a difficult time slot since so many people are leaving to catch an evening flight, but his talk was extremely well attended. His presentation was very engaging with a mix of visuals about weather maps and tools such as the diversion tracker and the gate demand chart. The goal is to figure out how to strategically delay some flights in order to minimize excess gate demand, minimize operations beyond airport closure time, minimize system-wide passenger impact and minimize delay introduced without violating crew/curfew rules. This talk was so good I hope Dr. Niznik can be one of the plenary or semi-plenary speakers at an INFORMS conference very soon. The high quality of his team's work deserves the visibility and dissemination to as large an audience as possible (and I don't even fly American). If you're a conference attendee, you can see his slides by logging into INFORMS Connect, clicking on My Communities and entering the Analytics 2016 community and browsing through the latest shared slides on the bottom left - he was part of the Tuesday Decision and Risk Analysis track.

The Analytics16 conference was one of the very best conferences I attended in recent memory and I'd like to thank Elea Feit for doing such an amazing job chairing the organizing committee (I know because I was part of the committee), as well as all my colleagues who helped put together such a remarkable event. The bar is set high for next year. The conference will be held in Las Vegas, NV, April 2-4, 2017. Mark your calendars!


Innovative Applications in Analytics Finalist: Detecting Preclinical Cognitive Change

This morning I attended a great talk as part of the "Innovative Applications in Analytics Finalist" track: Detecting preclinical cognitive change by Dr. Randall Davis and Dr. Cynthia Rudin of MIT. With the increased prevalence of dementia and Alzheimer's among the elderly, the associated health care expenses and the heart-wrenching situation of relatives who, when the disease is in an advanced stage, are no longer recognized by a dear parent, it is critical to diagnose cognitive decline as soon as one can so that early action can be taken and people can enjoy as much time as they can with a dementia-afflicted relative while this person is still himself or herself. Over 5m people have been diagnosed with Alzheimer's in the U.S. and the healthcare costs could soon be in the billions of dollars. The aging of the population also means that early diagnosis of dementia has emerged as one of the most pressing healthcare issues of our time. (The approach is applicable to other conditions such as sleep apnea.)

The talk showed how the classical Clock-Drawing test can be leveraged using new tools and technology to gain more information on a patient's cognitive state. There are in fact two clocks: the command clock (the patient is asked to draw a clock showing a time of ten past eleven) and the copy clock (the patient is shown an image of a clock showing a time of ten past eleven and has to reproduce it). The key is to analyze the process of drawing the clock and not just the final result. The team of Dr. Davis, Dr. Rudin and their coauthors has been able to do that using a specially designed pen (equipped with a camera) and a special paper (which lets the pen know where it is on the piece of paper). They call their test the digital clock drawing test. This allows them to measure key metrics such as the time it takes for the patient to draw the first hand of the clock after he or she has drawn the clock face and placed the numbers. It turns out that the pre-first hand latency - the time it takes for the patient to figure out where to draw that first hand of the clock - can help distinguish Alzheimer's from depression. Total Thinking Time is also an important metric, as is the "disappearing hooklet" on the first 1 of "11" in the numbers. (Basically when you are done drawing the first 1, you already think about drawing the second 1 starting from top to bottom so there should be a small hook at the bottom of the first 1, pointing toward the top of the second 1. A disappearing hooklet is one of the first signs of cognitive decline.)

In addition, the patient-drawn clocks have traditionally been scored by physicians in widely different ways, based on the distortion of the clock face, incorrect placement of the hands of the clock, and so on. The talk's authors showed convincingly how cutting-edge machine learning algorithms such as Supersparse Linear Integer Models (SLIMs) and Bayesian Rule Lists (BRLs) could be implemented to create decision rules that resemble the operational guidelines of physician-created scoring rules. This is important because it increases transparency and makes it more likely that physicians will adopt those new methods, since they are close to models they know. Physician-generated scoring systems achieved an AUC (area under the receiver operating characteristic curve) in the range of 0.66 to 0.79, where 0.5 is random and 1.0 is perfect prediction. Machine learning with all features achieved an AUC of 0.93 but is not as intuitive as the traditional physician-generated scoring systems. Machine-learning models based on SLIMs or BRLs achieve a tradeoff between those extremes with an AUC of the order of 0.8, improving on traditional physician-driven models while retaining high interpretability. As such, they are "Centaurs", or human-machine combinations that are better than either, applied to solving one of the greatest healthcare challenges of our time.
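For readers less familiar with AUC, here is a minimal sketch of how it can be computed: it is the fraction of (impaired, healthy) pairs in which the impaired case receives the higher score, with ties counted as half. The function name `auc` and the toy labels and scores are hypothetical illustrations, not data or code from the talk.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = condition present, 0 = condition absent.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 0.75
```

A scoring system that ranks cases at random would land near 0.5 on this measure, and one that ranks every impaired case above every healthy one would reach 1.0.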

Digital Cognition Technologies, Inc. is marketing the technology, now pending FDA approval.

Read more about this research here (news release), here (papers of the MIT CSAIL Multimodal Understanding Group) and here (Dr. Rudin's papers). Specifically, you can read the paper that accompanies the Innovations in Analytics Award entry here: "Learning Classification Models of Cognitive Conditions from Subtle Behaviors in the Digital Clock Drawing Test." Fascinating stuff!


Monday's poster session

I was very impressed by the poster session at Analytics16 yesterday, both in terms of the quality of the posters and the number of people who stopped by to ask me questions about my work. I was presenting my and Dr. Ruken Duzgun's research on multi-range robust optimization, with a focus on a case study we did comparing two-range robust optimization (2R-RO) with stochastic programming (SP). While traditional RO of the Bertsimas & Sim variety ends up only considering the nominal and worst-case value of each coefficient at optimality, multi-range RO can incorporate more than 2 scenarios (2R-RO can have up to 4 scenarios, for instance) and thus offers a bridge between traditional RO and SP. Our approach solves within seconds while SP hits the time limit of 1 day. You can read our papers here and here. Thanks to Sudharshana Srinivasan for taking my picture!
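To make the remark about traditional RO concrete, here is a minimal sketch of the worst-case cost under a budget of uncertainty, assuming an integer budget `gamma` and a fixed decision vector `x`: at most `gamma` coefficients move to their worst-case value and the rest stay nominal. The function name and the numbers are hypothetical, and this illustrates only the classical single-range setting, not the multi-range method from the poster.

```python
def worst_case_cost(nominal, deviation, x, gamma):
    """Worst-case linear cost when at most `gamma` coefficients deviate.

    Each coefficient i is either at its nominal value nominal[i] or at its
    worst-case value nominal[i] + deviation[i]; an adversary picks the
    `gamma` coefficients whose deviation hurts the most for the given x.
    """
    base = sum(c * xi for c, xi in zip(nominal, x))
    # Extra cost incurred if coefficient i goes to its worst case.
    extra = sorted((d * abs(xi) for d, xi in zip(deviation, x)), reverse=True)
    return base + sum(extra[:gamma])


# Hypothetical example: three unit decisions, budget of two deviations.
print(worst_case_cost([10, 20, 30], [1, 5, 2], [1, 1, 1], gamma=2))  # 67
```

With `gamma=0` the function returns the nominal cost, and with `gamma` equal to the number of coefficients it returns the fully pessimistic cost.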


Richard E. Rosenthal Early Career Connection Program

The Richard E. Rosenthal Early Career Connection (RER ECC) on Sunday was a great success! RER ECC participants mingled with conference attendees selected for their shared expertise and record of contributions to INFORMS. I particularly want to thank Elea Feit, Mike Trick and Robin Lougee for taking the time to share their insights with the RER ECC participants (with apologies to anyone I forget). The program targets young professionals only a few years into the workforce. I was very impressed by the record of accomplishments of this year's cohort and their potential to implement large-scale analytics at companies like General Motors or Air Liquide. All of them were extremely articulate in addition to being exceptionally talented and I am sure we will hear from them in the future for their operations research accomplishments. I am including the slide my co-chairperson Tarun Mohan Lal of Mayo Clinic prepared for the introduction of the RER ECC participants during one of the Analytics16 keynote addresses, which contains their names and pictures (make sure to congratulate them on their selection to RER ECC if you see them), and a picture of the reception taken by RER ECC '15 alumna Sudharshana Srinivasan, who was instrumental this year in helping Tarun and me deliver a high-quality Richard E. Rosenthal Early Career Connection program.



ECC research at INFORMS Annual Meeting

Today's post is about the presentations that two participants in the Richard E. Rosenthal Early Career Connection program - Shokoufeh Mirzaei and Ehsan Salari - gave on their work at the 2015 INFORMS Annual Meeting in Philadelphia, PA. (Nominations for the 2016 ECC program are now open and due March 4! More information is available here.)

Dr. Mirzaei (Shokoufeh hereafter), who is a tenure-track Assistant Professor at Cal State Pomona, gave a talk on open problems in computational structural biology and protein quality assessment. She explained why researchers care about the structure of a protein (shape affects function). Many diseases happen as a result of misfolded proteins (for instance Alzheimer's and Parkinson's) and it is important in the drug discovery process to understand a protein's interaction with other proteins. The goal of this line of work is to design new proteins with desired functions not currently found in nature, with the hope that computational work will be able to replace at least in part experimental work, i.e., drug trials on individuals. A challenge is that there is no clearly defined energy function, so there is no clear objective to minimize. Criteria that can be used in the optimization framework include: hydrogen bonds, Van der Waals interactions, backbone and angle preferences, electrostatic interactions, and more. Those criteria lead to very nonlinear and nonconvex expressions, making the problem even more challenging.

Shokoufeh then discussed the Protein Data Bank (which offers opportunities for template-based, homology-based and free modeling) and the WeFold Coopetition, the purpose of which is to encourage "coopetition" (competition + cooperation) among labs to improve the state of knowledge regarding protein structure prediction. Certain classes of prediction targets have only seen modest gains over the past few years, so such an event has the potential to speed up the rate of discovery. Open problems in the field include finding the best scoring function and the best metrics to compare two protein structures. Shokoufeh then commented on the problem of creating a benchmark data set for testing different proteins and discussed computational approaches such as MESHI (using the clustering nature of proteins) and support vector machines.

This is a field I know nothing about and I was struck by the clarity of Dr. Mirzaei's presentation as well as the effectiveness of her communication skills in making very complex problems understandable to a lay audience. She proved a very articulate and effective speaker who convincingly made the case for her research. In today's world, analytics professionals must not only have the quantitative tools to make a difference but also communicate their work effectively and there is no doubt Dr. Mirzaei will soon be a star in her domain. 

Dr. Salari (Ehsan hereafter), Assistant Professor at Wichita State, gave a talk entitled: "Biologically-guided radiotherapy planning: fractionation decision in the presence of chemoradiotherapeutic drugs." His talk was based on his recent paper published in IIE Transactions in Healthcare Systems Engineering. You can find a technical paper version of the work here. Radiotherapy (RT) uses high-energy radiation beams to kill cancer cells by damaging DNA. The survival rate of cells when exposed to different rates of radiation follows a linear-quadratic model. RT treatments are delivered through daily fractions. A regimen is thus determined by the number of fractions and the radiation dose. Effects of fractionation are accounted for by using a concept called the biologically effective dose (BED).
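For reference, the standard textbook form of the linear-quadratic model and of the BED for radiotherapy alone can be written as follows. This is the usual RT-only formulation, not the extended model from Ehsan's paper, and the symbols (dose per fraction d, number of fractions n, tissue parameters α and β) are notation I am assuming here.

```latex
% Surviving fraction of cells after a single dose d (linear-quadratic model)
S(d) = \exp\!\left(-\left(\alpha d + \beta d^{2}\right)\right)

% Surviving fraction after n fractions of dose d each
S_n = \exp\!\left(-n\left(\alpha d + \beta d^{2}\right)\right)

% Biologically effective dose of the regimen (n, d)
\mathrm{BED} = -\frac{\ln S_n}{\alpha} = n d \left(1 + \frac{d}{\alpha/\beta}\right)
```

As an illustration with made-up but typical-looking values, a regimen of n = 30 fractions of d = 2 Gy with α/β = 10 Gy gives BED = 60 × (1 + 0.2) = 72 Gy.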

Chemotherapy can be done sequentially or concurrently with RT but increases the risk of complications. It is therefore important to determine the impact of chemotherapeutic agents on optimal RT fractionation regimens. Additivity and radio-sensitization both affect the linear-quadratic curve depicting the survival rate of cells. Ehsan proposed an approach to extend the BED model to quantify the radiation damage and studied the optimal radiation fractionation regimen and the drug administration scheme in four cases: RT only, CRT with additive effects only, CRT with radio-sensitization effects only, and CRT with combined effects. His presentation contained many insightful graphs on the structure of the optimal regimen, which you can also find in his paper.

Like Shokoufeh, Ehsan proved to be an exceptional researcher delivering a compelling, insightful presentation of quality far above the average presentation at the annual meeting. I am looking forward to reading other papers by him.


#Analytics 2016: Richard E. Rosenthal Early Career Connection nominations now open!

The Richard E. Rosenthal Early Career Connection (ECC) program, to be held during the INFORMS Analytics 2016 conference in Orlando, FL April 10-12, 2016, is now accepting nominations. Nominations are due March 4, 2016.

Last year we redesigned the program to allow more interactions between participants and also between participants and conference attendees. Innovations that we plan to continue this year include inviting INFORMS leadership to the ECC reception for mingling with participants on Sunday, reserved tables for ECC participants and select conference attendees (invited based on ECC participants' research and work specialties) for Monday's lunch and Tuesday gathering during the coffee hour. 

From the website: 

The Early Career Connection (ECC) is a program of special events designed for professionals who are new to their academic or industry careers. The program facilitates networking and introduces participants to well-established researchers and practitioners for more effective communication. Participants benefit from a discount on registration to the conference, as well as the networking events exclusive to ECC participants. The discounted registration rate is $615.

Benefits

The mission of ECC is to provide early-career professionals with new perspectives into some of the most critical problems facing industry today, enabling them to broaden their research agendas. The goal is for these analytics and OR leaders of the future to have an opportunity, early in their careers, to apply their outstanding analytical talents to important business problems.

Those nominated and selected for this honor will receive a reduction of the conference registration fee. Awardees are expected to participate fully in the ECC events, as well as the conference sessions and social programs. We also strongly encourage all ECC nominees to submit a paper or poster to the Select Presentations or Poster Presentations (however, this is not a requirement for acceptance to the ECC).

To Nominate an ECC Participant

Please send an email to ellen.tralongo@informs.org by March 4, 2016, containing the following information. The nominator must be from the same organization as the nominee. 

  • Nominee’s name, email and telephone number
  • Nominator’s name, email and telephone number
  • Type of degree, year of nominee’s degree, and institute that awarded the degree
  • If nominee has a master’s degree, starting date at company
  • Organization and department
  • Nominee’s position at the organization
  • A brief paragraph from the nominator explaining why this person is being nominated (50-150 words)
  • A brief paragraph from the nominee describing their relevant research or analytics/OR project (50-150 words)