Food/activity tracking apps

I am back to trying to gain some insight by quantifying my life, and I am running into the same problem I always had with activity and food trackers in the past: they are simply not made to encourage people to change and to maintain those changes. Just a couple of problems to start with:

  • The activity tracking suggests at least ~150 minutes of cardio per week. For a new user who is just switching from a sedentary lifestyle to an active one, this is deadly: the most they can carry out is 60 minutes of cardio for the first month and a half. Trying to get to 150 minutes is a guaranteed recipe for failing to adhere beyond the first week or so, either for lack of will or because they will hurt themselves by trying to ramp up too fast. A better way of doing it would be to take ~2 weeks of monitoring at the start of each uninterrupted period of use, then suggest a ramp-up that gradually improves the habits of the user in a way that sticks in the long run.
  • In my own experience, the reason a lot of people end up in pretty bad shape is not necessarily that they don’t know any better: they don’t have the time, because work and other occupations constantly make self-care slide to the end of their list of priorities. A lot of activity/food tracking solutions require a lot of active input from the user and, because of that, tend to have a low adherence rate, especially in the long term. A much better option would be long-run monitoring that requires almost none.
  • Specifically for the food trackers: the lack of a unified repository of products and of the ability to log a fraction of the amount consumed. In some of them I was able to find teas that contain cholesterol (WTF?), but I wasn’t able to see what was in a product unless I reviewed the labels.
  • And as per usual, the current state of the trackers is deplorable when it comes to measuring anything other than calories. A lot of “healthy” foods are healthy not so much because they contain fewer calories, but because they contain a lot of micronutrients and vitamins that cover the body’s needs and prevent cravings in the long run.
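
The ramp-up idea from the first point can be sketched in a few lines. This is only an illustration: the function name and the 10% weekly increase are my own assumptions, not something any tracker is known to implement, and the baseline would come from the ~2 weeks of passive monitoring.

```python
def ramp_up_plan(baseline_minutes, target_minutes=150, weekly_increase=0.10):
    """Return weekly cardio targets growing from the observed baseline to the target.

    weekly_increase is a hypothetical growth rate (10% per week), not a medical guideline.
    """
    if baseline_minutes <= 0:
        raise ValueError("baseline must be positive; observe the user first")
    plan = []
    current = float(baseline_minutes)
    while current < target_minutes:
        plan.append(round(current))
        current *= 1 + weekly_increase
    plan.append(target_minutes)  # final week: the full recommended amount
    return plan


# A user who was observed doing about 60 minutes per week.
plan = ramp_up_plan(60)
print(len(plan), "weeks:", plan)
```

The point of the design is that the first suggested target equals what the user already does, so week one is always achievable.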

Bonus point: the Apple Health app unifying different apps. That doesn’t seem like much, but it definitely stitches all the apps together into one, making sure the information flows inside the health app ecosystem. It allows me to log an activity once, as opposed to 3-4 times before that, and still benefit from the best of all the apps without having to deal with the worst.

Sleep monitors and internet of things

I do think that sleep monitors should not require the user to activate them every night. Instead, monitoring should be something that runs in the background, like the GPS or the pedometer in your phone for walking distance monitoring.

Hence I see a tool that would have the two following functions:

  • movement detection for the quality of sleep computation
  • light detection, in order to figure out when you are sleeping or could be potentially sleeping

Usability of adhesion systems

Catch-22 with a pretty large health insurance website:

  • You need to give us the first payment to get your card.
  • In order to make a payment, you need to log in.
  • To log in, you first need to register.
  • To register, you need your adhesion number.
  • To get your adhesion number, you first need your card.

Best part? When I tried calling, I had to wait ~1 hour to get connected to the right person, with every branch of the telephone tree telling me that I needed to go to the website to do everything I needed. In addition to that, after waiting all that time, I was told I needed to wait until the invoice was generated.

Moral:

  1. Make sure you solicit the user’s action only when your system is ready for it and when that action is likely to succeed.
  2. Let your user create an account that is recognized from the get-go, even if it means that nothing is shown in that account yet.
  3. Have a collection point where the reports of your “happy system” malfunctions would go.
  4. Register failures to properly use the interface, progressively build a database of corner cases and edit your system fall-backs to account for them.
  5. Always test for usability to check that there are no catch-22s that will waste your tech support’s time.

Bonus points for the website: there is a paper invoice I hold in my hands, but the website shows that no invoice I could pay has been generated. Final bonus point: COMIC SANS. On the main USER-facing GUI page. Overriding other “sane” typefaces.

Writing a research paper with ReStructuredText

Why?

As a part of my life as a Ph.D. student, I have to read a large number of scientific papers and I have seen a couple of them written. What struck me was that these papers have a great deal of internal structure (bibliographic references, references to figures, definitions, adaptation to a particular public, journal, etc…). However, they are written all at once, usually in a Word document, in a way that all that structure lives and can only be verified in the writer’s head.

As a programmer, I am used to organizing complex structured text as well: my source code. However, experience has shown me that relying on what is inside your brain to keep the structure of your project together works only for a couple of lines of code. Beyond that, I have no choice but to rely on compilers, linters, static analysis tools and IDE tools to help me deal with the logical structure of my program and prevent me from planting logical bombs and destroying one aspect of my work while I am focusing on another. An even bigger problem is to keep myself motivated while writing a program over several months and learning the new tools I need to do it efficiently.

From that perspective, writing papers in a Word file is very similar to trying to implement high-level abstractions with very low-level code. In fact, so low-level that you are not allowed to import from other files (to automatically declare common abbreviations in your paper), define functions or import upon execution. Basically, the only thing you can rely on is the manuscript reviews by the writers and by the editors/reviewers of the journals where the paper is submitted. Well, unless they get stuck in an infinite loop.

So I decided to give it a try and write my paper in the same way I would write a program: using modules, declarations, import, compilation. And also throwing in some comments, version control and ways to organize revisions.

Why ReStructuredText?

I loved working with it when I was documenting my projects in Python with Sphinx. It allows quite a lot of operations I was looking for, such as .. include:: and .. replace:: directives. It is well supported in the PyCharm editor I am using, allowing me to fire up my favorite dark theme and go fullscreen to remain distraction-free. It also can be translated with pandoc to both .docx for my non-techy Professor and LaTeX files for my mathy collaborator.

It also allowed me to type the formulas with raw mathematics in LaTeX notation quite easily by using .. math:: directive.

How did it go?

Not too bad so far. I had some problems with pandoc’s ability to convert my .rst file tree into .docx, especially when it came to failing on the .. include:: directive and on citation formatting. (link corresponding issues) There was also some issue with rendering .png images in the .docx format (link issue). In the end, I had to translate the .rst files into HTML with the rst2html tool and then convert the HTML to .docx. For now, I am still trying to see how much of an advantage it is giving me.

After some writing, I’ve noticed that I am now being saved from dangling references. For instance, at some point I wrote a reference [KimKishnoy2013]_ in my text, and while writing the bibliography I realized the paper came out in 2012, so I defined the paper there as .. [KimKishnoy2012]. The rst compilation engine then threw an error on compilation: Unknown target name: "kimkishnoy2013". Yay, no more dead references! The same thing is true for the references defined in the bibliography but not used in the text.
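
To make the check concrete, here is a minimal sketch of the same idea done by hand with regular expressions. It is not how docutils implements it; the function name and the two patterns are my own simplifications, and they only cover plain labels such as [KimKishnoy2013]_ (no footnotes, no inline markup).

```python
import re

# [Label]_ in running text is a citation reference.
CITATION_REF = re.compile(r"\[([A-Za-z][\w.-]*)\]_")
# ".. [Label] ..." at the start of a line is a citation definition.
CITATION_DEF = re.compile(r"^\.\. \[([A-Za-z][\w.-]*)\]", re.MULTILINE)


def check_citations(rst_text):
    """Return (dangling references, unused definitions), both lowercased and sorted."""
    used = {label.lower() for label in CITATION_REF.findall(rst_text)}
    defined = {label.lower() for label in CITATION_DEF.findall(rst_text)}
    return sorted(used - defined), sorted(defined - used)


sample = (
    "As shown before [KimKishnoy2013]_, the effect is strong.\n"
    "\n"
    ".. [KimKishnoy2012] Placeholder bibliography entry.\n"
)
dangling, unused = check_citations(sample)
print("dangling:", dangling)
print("unused:", unused)
```

Labels are lowercased before comparison because the error message quoted above reports the target name in lowercase.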

Now, a shortcoming of this method of writing is that transitions between parts do not come naturally. This is easily mitigated: once the writer’s block has been overcome by writing all the parts separately, open the compiled HTML or .docx document and edit the elements so that they align properly.

An additional difference with the tools that have been developed to review code is that the information density and consistency of programming languages is closer to mathematical notation than to human-readable text, with all the redundancy necessary for proper understanding. A word change or a line change is a big deal in programming. It isn’t so important in writing, so all the annotation and diff tools built for code are not very useful here.

On the other hand, this is still related to the fact that human language is a low-level language. Git is not as useful for reviewing binaries as it is for reviewing source code in a programming language.

Over time, two significant problems emerged with that approach. First: incorporating the revisions. Since the other people in the revision pipeline are using the MS Word built-in review tools, in order to address every single revision I have to find the location in the text file tree where the revision needs to be made, then correct it. Doing it once is pretty straightforward. Doing it hundreds upon hundreds of times across tens of revisions by different collaborators is another thing altogether and is pretty tiresome.

The second problem is related to the first. When the revisions are more major and require re-writing and re-organizing entire parts, I need to go and edit the parts one by one, then figure out which part’s contents go into which new part. That is a lot of side work for not a lot of added value in the end.

What is still missing?

  • Conditional rendering rules. There are some tags that I would want to see when I am rendering my document for proofreading and correction (like part names, my comments, reviewer comments), but not in the final “build”.

  • Linters. I would really like to run something like the Hemingway app on my text, as well as some kind of Clonedigger to figure out where I tend to repeat myself over and over again and make sure I only repeat myself when I want to. In other terms, automated tools for proofreading the text and figuring out how well it is understood. It seems that I am not the only one with that idea: the Proselint creators have implemented a linter for prose in Python. It is something I am quite excited about, even though such tools are still in their infancy because of how liberal natural language is compared to a programming language. We will likely see a lot of improvements in the future, with the developments in NLP and machine learning. Here are a couple of other linter functions I could see being really useful:

    • Check that the first sentence of each paragraph describes what the paragraph will be about
    • Check that all the sentences can be parsed (subject-verb-object-qualifiers)
    • Check that there is no abrupt interruption in the words used between close sentences.
    • Check for the digressions and build a narration tree.
  • Words outside common vocabulary that are undefined. I tend to write for several very different audiences, about topics they are sometimes very knowledgeable about and sometimes not really. The catch is that since I am writing about these topics, I am knowledgeable about them, to the point that I sometimes fail to realize that some words might need a definition. If I had an app that showed me the rare words I introduce without defining them, I could rapidly adapt my paper to a different audience or, as the reviewers like to ask, “unpack a bit”.

  • Chaining with other apps. There are already applications that do a great job of structuring the citations and references in the desired format. I need to find a way to pipe the results of the .rst text compilation into them so that they can adapt the citation framework in a way that is consistent with the publication we are writing for.

  • Skeptic module. I am writing scientific papers. Ideally, my every assertion in the introduction has to be backed by an appropriate citation and every paragraph in the methods and results section has to be backed by the data, either in the figures or in supplementary data.

  • Proper management of mathematical formulas. They tend to get really broken. LaTeX is the answer, but it would be nice if the renderings of formulae could also be translated into HTML or docx, which has its own set of mathematical notation (MS Office always had to do it differently from open source standards).

  • Way to write from the requirements. In software we have something we refer to as unittests: pre-defined behaviors we want our code to be able to implement. As a rule of thumb, accumulating unittests is a tedious process that is nonetheless critical for building a system and for validating that, upon change, our software still behaves the way we expect it to. In writing we want to transmit a certain set of concepts to our readers, but because human time is so valuable, we regularly fail at that task, especially when we fail to see that the 100th+ revision left a concept referred to in the paper undefined. In software terms, it is a little bit like acting as a computer and executing the program in your head. It works well for small pieces, but there are edge cases, and what you know about what the program should do really gets in the way.
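
As an illustration of the “undefined rare words” check from the list above, here is a minimal sketch. Everything in it is hypothetical: the function name, the tiny common-word set (a real tool would load a frequency list for the target audience) and the idea of passing the defined terms explicitly.

```python
import re


def undefined_rare_words(text, common_words, defined_terms=()):
    """Count words that are neither in the common vocabulary nor explicitly defined."""
    known = {w.lower() for w in common_words} | {t.lower() for t in defined_terms}
    counts = {}
    for word in re.findall(r"[A-Za-z][A-Za-z-]+", text):
        key = word.lower()
        if key not in known:
            counts[key] = counts.get(key, 0) + 1
    return counts


# Toy vocabulary standing in for "what this audience already knows".
common = {"the", "binds", "protein", "folds", "it"}
draft = "The chaperone binds the protein. The chaperone folds it."

flagged = undefined_rare_words(draft, common)
print(flagged)
```

Switching audiences then amounts to swapping the common-word list and re-running the check.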

My Python programming stack for a refactoring job

IDE:

PyCharm Professional Edition:

  • Marvelous support for refactoring: renaming, moving
  • Marvelous support for code testing coverage
  • Excellent support for styling (PEP8, PEP308)
  • Marking of where code was edited + side bar allowing a really easy navigation through code
  • Split editor windows (in case you are editing two very distant locations within the same file)
  • Support for all the file formats I ever have to work with
  • Excellent integration with git/GitHub

Before I push:

flake8 --select=C --max-complexity 10 myProject

  • As a rule of thumb, everything with McCabe complexity over 15 needs to be refactored
  • Everything with complexity over 10 needs to be looked into to make sure method is not trying to do too much at once.
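
To give an idea of what the complexity number measures, here is a rough sketch that counts decision points per function with the standard ast module. It is an approximation written for illustration, not the algorithm flake8 uses, so the numbers may differ from what flake8 reports; the helper names and the sample code are made up.

```python
import ast

# Node types treated as one extra decision point each (a simplification).
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                ast.ExceptHandler, ast.IfExp, ast.comprehension)


def approx_complexity(source):
    """Map each function name to 1 + the number of decision points inside it."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1
            for child in ast.walk(node):
                if isinstance(child, BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    score += len(child.values) - 1  # each and/or adds a path
            scores[node.name] = score
    return scores


def flag(scores, review_above=10, refactor_above=15):
    """Apply the two rules of thumb: look into it above 10, refactor above 15."""
    return {name: ("refactor" if s > refactor_above else "review")
            for name, s in scores.items() if s > review_above}


sample = '''
def simple(x):
    return x + 1

def branchy(x):
    if x > 0 and x < 10:
        for i in range(x):
            if i % 2:
                x += i
    return x
'''
scores = approx_complexity(sample)
print(scores, flag(scores))
```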

clonedigger

  • Tells the complementary story of the project complexity and redundancy
  • In my experience, easier to read and understand than the pylint equivalent
  • Allows me to pull together methods that do the same thing and extract them as a function for better control

Upon push:

Travis-Ci.org + Coveralls.io

  • Made me think about how my end users will install it
  • Forces me to write unittests wherever possible.
  • Unittests + coverage are good: they force me to think about the paths the code can take and to implement proper tests or insert no-coverage statements (# pragma: no cover)
  • Once implemented, they make testing the code much simpler

QuantifiedCode.com / Landscape.io

  • Checks my code for smells and gives advice consistent with different PEPs.
  • Cites sources, so that I don’t have just the verification, but also the spirit behind its introduction
  • Sends me an e-mail upon each push telling me if I’ve introduced additional smells or errors
  • In the end fairly complementary to the tools I already have when it comes to static analysis

In addition to all of the above:

Developer Decision log:

  • Because I need to have a bird’s eye view of the project, I put all the decisions or decisions to make as I read and/or refactor code there, to make sure I know what is done and what is to do.
  • Unlike issues, it is a little bit more in-depth explanation of what is going on, not necessarily to be shown in the docs, nor necessarily worth opening/closing an issue.

What is missing

Profiling tools. I would really like to know how the modifications I introduce to code impact performance, but I don’t think I have yet found a way of doing it other than log everything and profile specific functions when I feel they are being really slow.

On the other hand, premature optimization never brought anything good, and a proper structuring of the project would make profiling and optimization easier in the long run.
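
For the “profile specific functions” part, the standard library already gets me some of the way. Below is a minimal sketch using cProfile and pstats; the wrapper name profile_call and the toy function slow_sum are made up for the example.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # five most expensive entries
    return result, stream.getvalue()


def slow_sum(n):
    """Toy workload standing in for a function that feels slow."""
    total = 0
    for i in range(n):
        total += i * i
    return total


result, report = profile_call(slow_sum, 10_000)
print(report)
```

Saving such reports before and after a refactoring would give at least a crude way to compare the performance impact of a change.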

Python: importing, monkey-patching and unittesting

I am in the phase of refactoring a lot of my code from several years ago, for a project that relies heavily on module-level constants (like database connection configurations). To me, defining constants at the beginning of a module, followed by several functions based on them that I will use later on in the code, just sounds much more Pythonic than wrapping all the internals in a class that is dynamically initialized every time one of its methods is needed elsewhere.

However, I have been progressively running into more and more issues with this approach. The first came when I tried to use Sphinx autodoc to extract the API documentation for my project. Sphinx imports modules one by one in order to extract the docstrings and generate the API documentation from them. Things can get messy when it does so on the development machine, but they get worse when the whole process runs in an environment that doesn’t have all the necessary third-party software installed, for instance whatever is needed for a proper database connection. In my case I got hurt by RTFD (Read the Docs) and had to solve the problem through the use of environment variables.

import os
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'

This, however, led to the pollution of production code with switches that were there just to prevent constants initialization. In addition to that, a couple of months down the road, when I started using Travis-Ci and writing unit-tests, this practice of using modules came back to bite me again. When I imported modules containing functions that relied on interaction with the database, the import automatically pulled in the module responsible for the database connection and attempted to connect to a database that was not necessarily present in the Travis-Ci boxed environment, and that I was not particularly eager to test while testing a completely unrelated function.

In response to that, I can see several possible ways of managing it:

  • Keep using the environment variables in the production code. Rely on RTFD to supply READTHEDOCS environment variable and set the UNITTEST environment variable when the unittesting framework is getting started. Check for those environment variables each time we are about to perform an IO operation and mock it if they are true.
  • Instead of environment variables, use an active configs pattern: import configs.py and read/write variables within it from the modules where it gets imported.
  • Pull together all the active behavior from the modules into class initialization routines and perform the initialization in the __init__.py for the classes, once again depending on what is going on.
  • Use the dynamic nature of Python to monkey-patch actual DB connection module before it gets imported in the subsequent code.

Following a question I’ve asked on Stack Overflow, it seems that the last option is the most recommended one, because it does not involve increasing the complexity of the production code, just moving elements to the module that implements the unittesting.

I think that what I would really need in Python is a pre-import patch that would replace some functions BEFORE they are imported in a given environment. All in all, it leaves me with the uneasy feeling that, unlike many other parts of Python, the import system isn’t as well thought through as it should be. If I had to propose an “I wish” list for the Python import system, these two suggestions would be the biggest ones:

  • Import context replacement wrappers:
    @patch(numpy, mocked_numpy)
    import module_that_uses_numpy
    
  • Proper relative imports (there should always be only one right way of doing it):
    <MyApp.scripts.user_friendly_script.py>
    
    from MyApp.source.utilities.IO_Manager import my_read_fle
    
    [... rest of the module ..]
    
    
    > cd MyApp/scripts/
    > python user_friendly_script.py
       Works properly!

    Compare that to the current way things are implemented:

    > python -m MyApp.scripts.user_friendly_script
       Works properly!
    > cd MyApp/scripts/
    > python user_friendly_script.py
       Fails...
    

It seems however that the implementation of the pre-import patching of modules is possible in Python, even if it is not really that easy to implement.

After digging through this blog post, it seems that once modules have been imported, they are inserted into the `sys.modules` dictionary, which caches them for future imports. In other terms, if I want to do run-time patching, I will need to inject a mock object into that dictionary under the name used in the import statement, overriding the module whose import leads to the side effect of a database connection.
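
A minimal sketch of that injection, using unittest.mock.patch.dict so that sys.modules is restored afterwards. The module name db_connection and its connect() function are hypothetical stand-ins for my real connection module.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for the real connection module.
fake_db = types.ModuleType("db_connection")
fake_db.connect = mock.Mock(return_value="fake-connection")

with mock.patch.dict(sys.modules, {"db_connection": fake_db}):
    # Any import of db_connection inside this block finds the fake in the
    # cache, so no real connection is ever attempted.
    import db_connection
    connection = db_connection.connect()

print(connection)
```

In a test suite, the module under test would be imported inside the with block (or in a setUp that starts the patch), so that its own import of the connection module resolves to the fake.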

Given that modifying sys.modules has the potential to break the Python import machinery, a (relatively) saner way of injecting a Mock module would be to insert a finder object into sys.meta_path, which won’t break the core Python import mechanics. This can be achieved by implementing a find_module() method on a subclass of importlib.abc.Finder (find_module() has since been deprecated in favor of find_spec() on importlib.abc.MetaPathFinder). However, these methods seem to be specific to Python 3.4, so we might instead need to run an alternative import from a path that patches the normal module behavior and mocks the database connection.
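
Here is a sketch of what such a finder could look like on Python 3.4 and later, written against the find_spec() protocol. The class name MockingFinder and the module name db_connection are hypothetical, and I have not tried this on a large code base.

```python
import importlib.abc
import importlib.util
import sys
from unittest import mock


class MockingFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve mock modules for selected names before the regular finders run."""

    def __init__(self, names):
        self.names = set(names)

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.names:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # anything else goes through the normal import machinery

    def create_module(self, spec):
        return None  # let Python create a default, empty module object

    def exec_module(self, module):
        # Populate the empty module with whatever the tests need.
        module.connect = mock.Mock(return_value="fake-connection")


finder = MockingFinder({"db_connection"})
sys.meta_path.insert(0, finder)
try:
    import db_connection
    connection = db_connection.connect()
finally:
    # Leave the import system as we found it.
    sys.meta_path.remove(finder)
    sys.modules.pop("db_connection", None)

print(connection)
```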

Let’s see if I will manage to pull this one off…

Dependency of a dependency of a dependency

Or why cool projects often fail to get traction

Today I tried to install a project I have been working on for a while on a new machine. It relies heavily on storing and querying data in a “social network” manner, and hence is not necessarily very well adapted to relational databases. When I was starting to work on it back in early 2013, I was still a fairly inexperienced programmer, so I decided to go with a new technology to underlie it: the neo4j graph database. And since I was coding in Python, was fairly familiar with the excellent SQLAlchemy ORM and was looking for something similar to work with graph databases, my choice fell on the bulbflow framework by James Thornton. I complemented it with the JPype native binding to Python for quick insertion of the data. After the first couple of months of developer’s bliss, with everything working as expected and being built as fast as humanly possible, I realized that things were not going to be as fun as I initially expected.

  • Python 3 was not compatible with the JPype library that I was using to rapidly insert data into neo4j from Python. In addition to that, JPype was quickly dropping out of support and was in general too hard to set up, so I had to drop it.
  • The bulbflow framework in reality relied on the Gremlin/Groovy Tinkerpop stack implementation in the neo4j database, was working over a REST interface and had no support for batching. Despite several promises of batching by its creator and maintainer, it never came to life and I found myself involved in a re-implementation that would follow those principles. Unfortunately I had neither enough programming experience to develop a library back then, nor enough time to do it. I had instead to settle for a slow insertion cycle (that was more than compensated for by the time gained on retrieval).
  • A year later, neo4j released the 2.0 version and dropped the Gremlin/Groovy stack I relied on to run my code. They had however the generosity of keeping the historical 1.9 maintenance branch going, so given that I had already poured something like three months full-time into configuring and debugging my code to work with that architecture, I decided to stick with 1.9 and keep maintaining it.
  • Yesterday (two and a half years after the start of development, by which time I had poured the equivalent of six more months of full-time work into that project), I realized that the only version of 1.9 neo4j still available for download to mere mortals, who will not know how to use maven to assemble the project from the GitHub repository, is crashing with a “Perm Gen: java heap out of memory” exception. Realistically, given that I am one of the few people still using the 1.9.9 community edition branch and one of the even fewer people likely to run into this problem, I don’t expect the developers to dig through all the details to find the place where the error is occurring and correct it. So at this point, my best bet is to put a 1.9.6 neo4j onto GitHub and link to it from my project, hoping that the neo4j developers will show enough understanding to not pull it down.

All in all, the experience isn’t that terrible, but one thing is for sure. Next time I build a project that I see myself maintaining in a year’s time and installing on several machines, I will think twice before using a relatively “new” technology, even if it is promising and offers a x10 performance gain. Simply because I won’t know how it will be breaking and changing in the upcoming five years and what kind of effort it will take me to maintain the dependencies of my project.

Usability of fitness trackers: lots of improvement in sight

Fitness trackers and other wearable tech are gaining more and more momentum, but because of the ostrich cognitive bias they are absolutely not reaching the populations that would benefit most from them. And as per usual, ReadWriteWeb is pretty good at pointing this out in simple language.

To sum up, current fitness tracking has several short-comings for the population it would target:

  • It is pretty expensive. A fitness band that does just the step tracking can cost somewhere between $50 and $150. If you are trying to go for something more comprehensive, such as one of Garmin’s multisport watches, you are looking at somewhere in the $300-$500 range. Hardly an impulse purchase for someone who is earning under 30k a year and has kids to feed from that. Yet that is the group at highest risk of obesity and cardiovascular disease.
  • They generate a LOT of data that is hard to interpret, unless you have some background as a trained athlete. Knowing your Vmax and heart-rate descent profile following an effort is pretty cool and useful for monitoring your health and fitness, but you will never know how to use them unless someone explains it to you or you already knew it from your previous athletic career.
  • They do not provide any pull-in. As anyone with a bank account knows, savings come from repeated effort sustained over time. The same goes for health capital. However, as anyone with a bank account also knows, when you hit hard financial times, you watch your bank account much less than when everything is going well, simply because it is rewarding in the latter case and painful in the former. Same thing with health: people who lack health but are ready to work on it are self-conscious about it and need an additional driving motivation to make them last through the periods where no progress is happening.
  • It does not respond to an immediate worry and is one of those products that are “good to have”, but whose absence does not lead to an “I need it RIGHT NOW” feeling.

 

With that in mind, I decided to participate in MedHacks 1.0 last weekend. My goal was to develop something that would provide an emergency warning for users who are either at high risk of stroke or undergoing one, so that they would not find themselves isolated while having a stroke. With my team, we managed to hack together a proof-of-concept prototype in about 24 hours, which took us into the finals. In order to do this, we used an audio mixing board to amplify the signal, Audacity to acquire the data on a computer, and FFT and pattern matching to retrieve the data and filter out loss-of-contact issues, and we built an Android app that was able to send out a message/call for help if the pattern changed.
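
To give a flavor of the signal-processing step, here is a small sketch that estimates the dominant rhythm of a signal. It is not our hackathon code: a plain discrete Fourier transform restricted to a plausible heart-rate band stands in for the FFT, the signal is synthetic, and the function names, the band limits and the 25% tolerance are assumptions made for the example. It is nowhere near a diagnostic tool.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate, low_hz=0.5, high_hz=3.5):
    """Return the frequency (Hz) with the largest DFT magnitude inside the band."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_freq, best_power = 0.0, 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if freq < low_hz or freq > high_hz:
            continue  # outside the band we care about
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        if abs(coeff) > best_power:
            best_freq, best_power = freq, abs(coeff)
    return best_freq


def rhythm_changed(baseline_bpm, current_bpm, tolerance=0.25):
    """Flag a change larger than the (hypothetical) tolerance around the baseline."""
    return abs(current_bpm - baseline_bpm) > tolerance * baseline_bpm


# Synthetic 10-second recording at 50 Hz with a 1.2 Hz rhythm.
sample_rate = 50
signal = [math.sin(2 * math.pi * 1.2 * i / sample_rate) for i in range(500)]
bpm = dominant_frequency(signal, sample_rate) * 60
print(round(bpm), rhythm_changed(baseline_bpm=70, current_bpm=bpm))
```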

Now, those are very simple components that could be compressed on a single sensor woven into a T-shirt and beamed onto a phone for analysis in background. We would need to do some machine learning to be able to detect most common anomalies and then validation by human experts of the acquired EKG.

However, the combination of a cheap, persistently monitoring device and an app that is able to exploit it opens up large possibilities for fitness tracking for those who need it most.

  • The reason to purchase and use the monitoring device is not fitness anymore. It is basic safety. And can be offered by someone who is worried for your health.
  • The basic functionality is really clear. Something is going wrong with you, we will warn you. Something is going really wrong, we will warn someone who can check on you or come to your rescue.
  • We can build upon the basic functionality, introducing our users to the dynamics of fitness in a similar way games introduce competitive challenges: gradually and leaving you the time to learn at your pace.
  • We have very precise access to the amount of effort. Your heart rhythm will follow if you are doing a strenuous directed activity, and we will guide you in it.
  • We were able to build a prototype with very common materials. Compression and mass-production will allow us to hit the lowest market range, at a price where you are paying for a smart athletic piece of clothing only marginally more than for the same “non-smart” piece of clothing.

Sounds interesting? I am looking for someone with clinical experience in heart diseases, a hardware hacker with experience in wearables and someone to drive consumer prospecting and sales.

Competition does not always bring quality: case study of shopping apps

The problem is simple. I would like to have an app to help me manage my shopping list for me.

Until now I have been using AwesomeNote’s notebook filled with a lot of “todo” boxes and a separate note for each shopping session. This was kinda working ok, but could be better.

First, I realized I had plenty of checkboxes left unchecked from previous shopping sessions that I still might want to be aware of when I am shopping. What would be really cool is a single overall set of checkboxes, where an item disappears once I have checked it off. Until I add it again next time. Or even better, until it pops up by itself: it shouldn’t be hard to predict what I am buying weekly or even monthly and add it automatically.

Second, I realized that my shopping list was context-dependent. I might do most of my grocery shopping at one place, but sometimes I need something specific from a different shop, where I don’t go that often. By the time I reach it, the note I’ve made is buried deep underneath my other shopping lists. Some location-awareness could be pretty cool.

Finally, I kinda don’t like typing too much, especially if it’s the same thing. If it could do a nice autocomplete, or even have an intelligent UI that would save me time spent in the app, that would be pretty cool.
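
The predictive part really does not need to be fancy. As a sketch of what I have in mind (the function name, the three-purchase minimum and the sample history are all made up), it could be as simple as comparing the time since the last purchase with the usual gap between purchases:

```python
from datetime import date, timedelta


def due_items(purchase_history, today, min_purchases=3):
    """Return items whose usual repurchase interval has elapsed."""
    due = []
    for item, dates in purchase_history.items():
        dates = sorted(dates)
        if len(dates) < min_purchases:
            continue  # not enough history to guess a rhythm
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        typical_gap = sorted(gaps)[len(gaps) // 2]  # upper median of the gaps
        if today - dates[-1] >= timedelta(days=typical_gap):
            due.append(item)
    return sorted(due)


history = {
    "milk": [date(2015, 9, 1), date(2015, 9, 8), date(2015, 9, 15)],
    "salt": [date(2015, 3, 1), date(2015, 6, 1), date(2015, 9, 1)],
}
suggestions = due_items(history, today=date(2015, 9, 22))
print(suggestions)
```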

Having a pretty good picture of what I wanted (a location-aware shopping list app with a quick UI and predictive analytics), I set out to find one. There are literally thousands of them all over the App Store; there should be at least one that would fit my needs, no?

Nope. Despite all the power of Google and AppCrawler, I am still looking for the one I want.

TO BE CONTINUED…