## On Estimating Thailand’s Infection Fatality Ratio (IFR)

This post records the method used to estimate Thailand’s IFR, nationally and by province, together with the death estimates and their distributions per province. This was sparked by Matt Greenfield and Dylan Jay’s discussion. Obviously, there’s too much to fit in a tweet.

In the article, we show (1) how readers can directly calculate their own estimates for Number of Infected Cases from our data, followed by (2) the method taken to reach the results, then (3) discussion, limitations and conclusions. (**Be sure to read the caveats.**)

## How to Calculate Number of Infected Cases (est.) in Thailand from IFR

**UPDATE 23:00, MAY 11TH 2021**

After further effort sparked by Dylan Jay, we have worked through to achieve an agreeable answer (final breakthrough by Matt Greenfield) to translate these IFR scores in the righthand side plot below into Number of Infected Cases (as lower and upper bounds). Infected Cases differ from actual Recorded Cases, and they offer a view on whether more cases are likely to be found.

Monumental stuff for Thailand Public COVID-19 data reporting. In a few days, we’ll see about including this in daily information. In the meantime, you can apply your own calculations as follows; and take notice of the caveats described in this article, when drawing conclusions...

##### for Thailand

Thailand’s total recorded deaths are 336 for our dataset in the date range 2021-03-02 to 2021-05-10. The reported IFR is [0.015 to 0.025]. In this case, the plot shows the mean IFR across all 77 provinces. So we must reverse that mean (multiply by 77) to recover the sum, then proceed to calculate estimated infected cases, as follows:

This is the Min, Max and Mean expected Number of Infected Cases, for the date period.

We calculate the difference from Recorded Cases (58,048) to decide whether more (or fewer) cases are expected to be found. It looks like Recorded Cases are in the upper half of the range, so case finding looks good across Thailand.

##### for Each Province

This method applies to all provinces. Bangkok is our example. Bangkok’s total recorded deaths are 148 for our date range. The reported IFR is [0.443 to 0.753]. In this case, the plotted IFR has been multiplied by 100 (I’ll fix it in a few days). So we must reverse that (divide by 100) to get a calculable IFR, before calculating Infected Cases as follows:

Again, Min, Max and Mean expected values of Infected Cases are shown for the date period.

We can calculate the difference from Recorded Cases (20,781). For Bangkok, it looks like the Recorded Cases are in the lower half of the range; therefore, it looks like many new cases are still to be found in Bangkok.
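The Bangkok arithmetic can be sketched in a few lines of Python. This is only a sketch under stated assumptions: that Infected Cases are estimated as deaths divided by IFR, and that the Mean is the midpoint of the Min and Max estimates. The function name `infected_cases_range` is hypothetical and is not taken from the project code.

```python
def infected_cases_range(deaths, ifr_low_pct, ifr_high_pct):
    """Return (min, mean, max) estimated infected cases.

    The IFR bounds are given as plotted (i.e. already multiplied by 100),
    so they are divided by 100 first to get a calculable ratio.
    """
    ifr_low = ifr_low_pct / 100.0
    ifr_high = ifr_high_pct / 100.0
    est_min = deaths / ifr_high   # a higher IFR implies fewer infections
    est_max = deaths / ifr_low    # a lower IFR implies more infections
    est_mean = (est_min + est_max) / 2.0  # assumption: midpoint of the range
    return est_min, est_mean, est_max


# Bangkok figures quoted in the text above
bkk_min, bkk_mean, bkk_max = infected_cases_range(148, 0.443, 0.753)
recorded_cases = 20781

print(round(bkk_min), round(bkk_mean), round(bkk_max))
print("Recorded cases below the midpoint:", recorded_cases < bkk_mean)
```

Under these assumptions the range comes out at roughly 19,700 to 33,400 infected cases, which places Bangkok’s 20,781 Recorded Cases in the lower half, consistent with the reading above.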

See the righthand side plot and follow the calculation, to discover Estimated Number of Infected Cases for your province.

## Infographic: May 10, 2021

(Not yet suitable for reproducing or sharing).

## Method

The method has caveats and assumptions, as follows:

• The Cases and Deaths data is from the cases by province dataset, which holds its first recorded death associated with a province on 2021-03-02 and its latest recorded death on 2021-05-10. Only this range’s cases and deaths are used in this analysis. In this dataset’s range, there were 58,048 recorded cases and 336 recorded deaths. For reference, in the covid19.th-stat dataset between 2021-03-02 and 2021-05-10, there were 58,932 (85005-26073) reported cases, a discrepancy of 884; and 337 (421-84) reported deaths, a discrepancy of 1.

• This National Statistics Office Age distribution dataset was used, extracting 2010 data only. When it was created Bueng Kan didn’t exist, so I copied Nong Khai’s data (Bueng Kan was split from Nong Khai in 2011, so it likely held the same age distribution) and used a normalised age distribution, per age bucket.

• In the Age Distribution dataset there are 17 buckets: 0-5, …, 75-79 and 80+. I split 80+ into 80 to 85, to match the table of age distributions and risk levels in epimonitor’s IFR analysis.

• Therefore, I have split the deaths based on the age distribution per province, and that effectively gives the green line. In fact the green line is further multiplied by the IFR Risk % to bring it closer to COVID-19 death likelihood; yet this has a smaller visual impact.

• From the table of age distributions in epimonitor’s IFR analysis, I interpolate linearly between the known age and Risk% values, and extrapolate (backfill) to 0 with a value matching the earliest entry (i.e. 5yo to 0yo each have an equal likelihood of passing).

• Taking account of the Risk% dependency on 100,000 cases in epimonitor’s IFR analysis table, I calculate the factorisation to bring recorded Deaths and Cases in line with the Risk% (per 100k).
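The interpolation and backfill step described in the bullets above can be sketched as follows. This is an illustration only: the (age, Risk%) pairs are placeholder numbers, not the values from the epimonitor table, and the function name `interpolate_risk` is hypothetical. Holding the last value flat beyond the oldest known age is also an assumption; the text only describes the backfill towards age 0.

```python
def interpolate_risk(age, known_points):
    """Linearly interpolate Risk% at `age` from (age, risk_pct) pairs.

    Ages below the first known age are backfilled with the first value;
    ages above the last known age reuse the last value (assumption).
    """
    pts = sorted(known_points)
    if age <= pts[0][0]:
        return pts[0][1]
    if age >= pts[-1][0]:
        return pts[-1][1]
    for (a0, r0), (a1, r1) in zip(pts, pts[1:]):
        if a0 <= age <= a1:
            return r0 + (r1 - r0) * (age - a0) / (a1 - a0)


# Placeholder values for illustration, NOT the published table
known = [(5, 0.001), (15, 0.003), (25, 0.010)]

print(interpolate_risk(0, known))    # backfilled: same as the 5yo entry
print(interpolate_risk(10, known))   # halfway between the 5yo and 15yo entries
print(interpolate_risk(30, known))   # held at the last known entry
```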

At this point, I separate the Min and Max of each age bucket’s record and allocate the appropriate (interpolated) Risk% to each, thereby giving each row (e.g. the 10-15yo bucket) a specific minimum Risk% (for 10yo) and specific maximum Risk% (for 15yo). Effectively, this gives us the ranging data between the orange and blue lines and the upper/lower bound estimates relative to 100,000 cases.

Next I reduce the Min/Max Risk% estimates from 100k to the provincial relative numbers. As each of the Min/Max Risk% values are relative to 100,000, I normalise these. Then apply the province’s normalised age bucket population factor, which is proportional to the province’s true population. Then finally, having factorised the case/deaths (to/from 100k), I use this to reduce the Expected IFR (death) rates (in orange and blue) down to our province’s relative death rates. This is just the factor of 100k/case numbers.

This gives us a reasonable ranging estimate for expected deaths and expected age ranges for those deaths, relative to the Province’s age distribution.

## Righthand Side Plot – IFR

In the righthand side plot (above), an Estimate for IFR is reported, as defined in the equation below. This is calculated from the number of deaths split over the stratified age distribution per province (as described above) and multiplied by the corresponding Risk% column value from epimonitor’s IFR analysis page. These score values are summed, giving IFR as a percentage. This calculation of IFR has not been x100.

The estimated national IFR is shown in min/max range: [0.015-0.025]. The individual province IFRs are shown similarly, for example: Bangkok: [0.443-0.753] and Chiang Mai: [0.049-0.083].

Within Imperial College London’s (ICL) Oct 2020 article on IFR, the authors indicate that values of 1.15% (0.78-1.79) and 0.23% (0.14-0.42) were normal at the time of writing for “high income” (more elderly people) and “low-income” (more young people) countries, respectively.

ICL state “the infection fatality ratio (IFR) is a key statistic for estimating the burden of COVID-19”, so perhaps higher means more burden; higher risk of death due to greater population of elderly.

Using ICL’s article values to calibrate our understanding, we can position Bangkok’s IFR and nationally, Thailand’s IFR within their range of “high to low income”.

Crucially, the IFR lets us estimate the Number of Infected Cases.

The WHO’s article shows an IFR calculation, similar to the one I have described in the equation above, though with interpretation. As stated, we lack both (i) Age distribution of deaths data and (ii) random samples of test data, in order to calculate this precisely. The WHO article also indicates the difference between measured Case Fatality Rate and Infection Fatality Ratio, as differentiated by antibody testing. As far as I am aware, in Thailand’s publicly available data, there is no available serological testing data from random samples (please leave a comment if you have access to such data).

## Lefthand Side Plot: A Weaker Estimate

On the lefthand side plot, we gain a view on the reporting of deaths. By taking the mean and min/max of the Expected IFR Death Distribution, we gain a ballpark range within which we expect our green line (estimated age distribution of deaths) to fall. It doesn’t, because we do not have the true patient death ages and the (case/death) numbers are too small to be accurately representative. But there is some redemption.

The true distance of (i) the recorded province Deaths sum from (ii) the Expected IFR Mean average sum is the large +/- number within each of the province plots (both sums are taken over the age distributions). This difference is most interesting, because the Expected IFR Mean comes directly from the larger (trusted) population. This difference from Recorded to Trusted is a good measure for indicating whether the values are above or below the globally expected deaths, relative to each province.

However, this distance measure is calculated with (dependent on) actual Recorded Case numbers, and not Infected Case numbers; which affects how we might use this number.

## Limitations:

The weakness here is in the green line. It shows an estimate of deaths over the continuous age range. We don’t have the actual COVID-19 death ages per province, at this time. In many provinces, no deaths have been recorded, so the age distribution data (death by age distribution), has a second degree of imprecision (the green line is flat). In the righthand side plot only, for provinces without any deaths data, we have no information on their IFR; and many provinces are in this condition.

## Conclusions:

The righthand side plot shows the IFR. These IFR figures fall approximately within the expected range published by ICL. According to their IFR ratings, Bangkok falls into the upper half of “low income” (more young people) countries and nationally, Thailand falls into the lower third of “low income” countries.

Most importantly, the IFR lets us estimate the Number of Infected Cases per province and across Thailand, to decide whether the Recorded Cases represent all the cases out there to find. On this front, Thailand is nationally doing okay (above the estimated mean).

In the lefthand side plot, the Expected IFR and differences from it are a useful guide for whether nationally, and in each province, the reported deaths match global expectations. One should consider the three presented measures of difference: mean average, min and max. The “true” value is somewhere near those ranges.

Considering all the caveats, these results indicate that Thailand is nationally +69.7 deaths (from Expected IFR Mean), and sitting at the maximum bound. We might interpret this as the reported deaths matching those expected at the upper limit. Too many deaths? Possibly. But perhaps more likely is that this *might* indirectly indicate that case finding has sufficiently led us towards the global expected death rates, which is good for the staff working on case finding and reporting. However, I am not tying myself to these conclusions; this is “back of envelope” stuff.

Without the age of deaths, the sums are slightly miscalculated, even though we have corrected for (i) province age distribution in IFR and (ii) Expected IFR risk % in the deaths projection estimates. The lack of serological testing data from random sampling also raises the question of whether these figures would change further given complete information.

(**Disclaimer:** I do not claim to be an expert on the intricacies of COVID-19 infection metrics or public health data. I can guarantee the process and code are accurate; the presentation of IFR will be improved in the near future, and Infected vs Recorded Cases will be shown. Thanks for reading. Thanks to Dylan Jay and Matt Greenfield for the collaboration.)

## Extension of Python Dict .get() – Lookup with Similarity for Built-in Libraries

This is a prospective extension to Python dict .get() that solves a common problem in data applications. The bold proposal asks whether to include such an implementation in the core language or in a library, across languages used for data processing. See what you think..

#### Background & Why?

These days we have more data-oriented code being written (ML/AI, etc). Data is often “dirty” (missing values/spelling errors/grammar typos/etc). “Fuzzy” (less certain) matching can be useful in many of these cases (and traditionally in SQL we might use `LIKE` with `%` wildcards). Dictionary implementations (i.e. hashmap, hashtable, associative-array, etc) are an efficient lookup mechanism. They respond as a `boolean` lookup: the key is there, or the key is not. It’s unconventional to think of their lookup with a `confidence` measure. In data-oriented code, dictionaries are often used for matching data or conditionally `joining` datasets. When we cross (natural) languages we get more typo variations (in the general sense, two languages means double the varieties of typos) and therefore a greater likelihood of mismatches when performing lookups (or translating) across those languages.

#### Code Description:

Below is `Python` code using the `difflib` string similarity library. The code will perform a lookup in a dictionary (`dict`), using a double `get_or_else` mechanism. `get_or_else` has become a broadly adopted functional paradigm best practice in software engineering in order to replace `if-else` blocks with a (coupled) `curried` function. Coupling multiple `get_or_else` function calls is normal, yet tends to give more edge cases / complicates testing. It remains unconventional to throw `confidence`-based matching into this mix; which is precisely what we do here:

The dictionary lookup will either:

1. match the `key`, or
2. match the closest `key` with a similarity score `>= threshold` (default `0.5`), or
3. fail to match, and return `default_value`.
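The three outcomes above can be sketched as a single function. This is a reconstruction from the step descriptions in this post, not the original listing: the function name `get_or_threshold_match` and the choice of `difflib.SequenceMatcher(...).ratio()` as the similarity score are assumptions. The comments are numbered to match the breakdown that follows.

```python
from difflib import SequenceMatcher


def get_or_threshold_match(d, key, default_value=None, threshold=0.5):
    # (1) Similarity score of `key` against every dictionary key: O(n)
    scores = [(k, SequenceMatcher(None, str(key), str(k)).ratio()) for k in d]
    # (2) Keep only keys that reach the threshold: O(n)
    matches = [(k, s) for k, s in scores if s >= threshold]
    # (3) Best score first
    matches.sort(key=lambda ks: ks[1], reverse=True)
    # (4) Top match, or a NaN placeholder that cannot match a genuine key
    top_match = matches[0] if matches else (float("nan"), 0.0)
    # (5) Extract the key from the (key, score) tuple
    top_match_key = top_match[0]
    # (0) + (6) Exact lookup first, then the similarity match, then the default.
    # Python evaluates the inner get() eagerly, before the outer one runs.
    return d.get(key, d.get(top_match_key, default_value))


provinces = {"Bangkok": "BKK", "Chiang Mai": "CNX"}
print(get_or_threshold_match(provinces, "Bangkok"))       # exact match
print(get_or_threshold_match(provinces, "Bangkk"))        # typo, matched by similarity
print(get_or_threshold_match(provinces, "zzz", "n/a"))    # no match, default returned
```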

## Code Breakdown

Obviously, the above looks like code golf, so let’s step through the call and show each operation:

#### (0) Standard Get or Else `dict` lookup:

The base operation is a standard `dict` key check using the `get(..)` method. `get(..)` ensures an exception is not raised if the `key` is not found. If the `key` is found it returns. (See the Runtime Analysis below for more on this execution, as it is evaluated after snippets 1-6).

#### (1) Similarity Scores:

Create a list of matches to each key. This is an O(n) operation, checking every `dict` key.

#### (2) Filter by Threshold Score:

Keep the keys with a similarity score that reaches or exceeds the `threshold` value. This is an O(n) operation, checking every `dict` key.

#### (3) Sort to find the best match:

Sort the filtered results, so the best result is in index position `0`. This is an O(n) to O(n log n) operation (for Timsort: i.e. Insertion or Merge).

#### (4) Get the top match or Handle if there are no matches. Ensure a value is returned:

We’re using `float('nan')` here because it should never unintentionally match a genuine key. Python’s `None` does not behave like an SQL-style `null`: `None == None` is `True`, whereas a `null` never equals another `null`. `float('nan')` provides that `null` behaviour, since `float('nan') == float('nan')` is `False`. This is an O(1) operation.
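A short check of that behaviour, using a small made-up dictionary (the keys and values here are illustrative only):

```python
d = {"Bangkok": 1, None: 2}

# None equals itself, so a None placeholder could match a genuine None key.
print(d.get(None, "default"))          # 2

# A freshly created NaN is not equal to anything, including another NaN...
print(float("nan") == float("nan"))    # False

# ...so the lookup falls through to the default.
print(d.get(float("nan"), "default"))  # default
```

One caveat worth knowing: dictionary lookups check object identity before equality, so the very same NaN object used as a key would still be found. Creating a fresh `float('nan')` for the placeholder avoids that.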

#### (5) Extract the key value (i.e. from `(key,score)` tuple) or handle no matches:

The same reasoning applies for `float('nan')`: it ensures the `null` result will not match an existing dictionary key. This is an O(1) operation.

#### (6) Second-Level Get_or_Else Lookup:

A simple get_or_else lookup. Note, that if a `top_match_key` was not found, then its value will be `float('nan')`, which will not match. Therefore, it will fail and return the specified `default_value`. This is an O(1) operation.

## Runtime Analysis

Total time complexity is O(3n+3c), i.e. linear in the number of keys n, for average, worst and best case scenarios (excluding variations in dependent functions, e.g. Timsort, whose worst case is O(n log n)). Comparatively, `dict`’s native lookup is O(1).

In the Appendix: Lazy Implementation section below, you can find a time complexity of O(3n+3c) for average and worst case scenarios and best as O(1), by separating the `boolean` and `confidence`-based lookups into (`curried`, yet) separate function calls.

## Appendix:

#### Appendix: Lazy implementation

This implementation improves the best case execution time. In this case the similarity lookup is optional, and lazily called. If `key` is not found within the dictionary, `get_or_threshold_match_lazy()` will return a function (object) pointer, which can then be called. Note: the major difference in this function is on line 11.

Why? Well, the eager implementation (above) will first evaluate the O(3n+3c) lookup, then it will try the O(1) lookup. The lazy implementation, will first evaluate the O(1) lookup, then it will wait for evaluation of the O(3n+3c) lookup.
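A minimal sketch of the lazy variant follows. Only the name `get_or_threshold_match_lazy()` comes from this post; the body is a reconstruction, and the use of `difflib.SequenceMatcher` and the inner helper name `similarity_lookup` are assumptions. The original listing’s line numbering does not carry over to this sketch: here the difference from the eager version is the final `return`, which hands back the uncalled function as the fallback.

```python
from difflib import SequenceMatcher


def get_or_threshold_match_lazy(d, key, default_value=None, threshold=0.5):
    def similarity_lookup():
        # The O(n) work only happens if the caller invokes this function.
        scored = [(k, SequenceMatcher(None, str(key), str(k)).ratio()) for k in d]
        matches = sorted(
            (ks for ks in scored if ks[1] >= threshold),
            key=lambda ks: ks[1],
            reverse=True,
        )
        top_match_key = matches[0][0] if matches else float("nan")
        return d.get(top_match_key, default_value)

    # O(1) exact lookup; on a miss, return the function itself (not its result).
    return d.get(key, similarity_lookup)


provinces = {"Bangkok": "BKK", "Chiang Mai": "CNX"}

hit = get_or_threshold_match_lazy(provinces, "Bangkok")
miss = get_or_threshold_match_lazy(provinces, "Bangkk")

print(hit)                                   # exact match, no similarity work done
print(miss() if callable(miss) else miss)    # similarity lookup runs only now
```

The caller has to distinguish a value from a pending lookup, here with `callable()`. That check becomes ambiguous if the dictionary’s own values are callables, which is part of the extra complexity listed under Cons.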

Pros:

• Time complexity best case is O(1). Still average and worst is O(3n+3c).
• An option to separate function calls, and conditionally request the secondary function.
• Good for large lookup dictionaries.

Cons:

• More complex code to write / read.

#### Appendix: Eager implementation (above) Pros & Cons

Pros:

• Relatively simple code to write.

Cons:

• Bad for dictionaries with large key sets.
• Guaranteed O(3n+3c) for dictionary lookups.

## Thailand Province Border Adjacency Dataset/Code

A quick update post to help get my latest project’s new dataset more readily indexed on Google search, etc. (Feb 8th 2021)

I’ve recently been working on risk assessment for COVID-19 in our 2nd wave. To create an email alert per province (taking account of local regional data) I needed to join provincial data together. It turns out that for much of Thailand’s publicly available government datasets (particularly in Office of Agricultural Economics, Land Department, etc) the data is summarised at Province level (i.e. is not GIS coordinate-based). Yet, there’s no mapping of province -> [neighbouring provinces] dataset out there (that I could find), so I created one the other night and wrote the code to verify and integrate it.

That dataset/code is now on github: https://github.com/pmdscully/thailand_province_border_adjacency

An obligatory requirement of using data relations (X->Y) is making a pretty visualisation on GraphViz, so dutifully — here it is: ^^ (Along with Wikipedia’s provincial public map for comparison..)

## Q & A

Is it correct & up to date? Yes. The newest Thai province change was adding Bueng Kan, which was split-off from Nong Khai, effective on 23 March 2011 – that’s included; so it’s up-to-date as of Feb 2021. Bangkok is referred to as a Special Administrative Area, but it’s included as province in the mappings; giving a total of 77 entries.

Is it easy to use the mapping dataset by importing a Python module into my own software application? Yes, you can join province datasets together based on their semantic geo-neighbourhoods – 🙂

1. Just `git clone` the repository,
2. download a province naming dataset,
3. import the Python module,
4. write about 4 lines of code to get a dictionary lookup (see the readme.md for full details).

I want to SQL join my provincial datasets together, but only for the provinces nextdoor, how can I do that? Yes, that’s precisely what this dataset and code is for. Before you create your SQL query,

1. import the Python module (`province_neighbours.py`),
2. instantiate the ProvinceRelationsParser object,
3. get the dictionary,
4. perform the dictionary lookup on your key province, this will give you the list of neighbouring provinces.
5. Simply plug those names into your SQL query and you are ready! (Find a code example in the readme.md).
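As a rough illustration of that last step, the sketch below plugs a list of neighbouring provinces into a parameterised SQL query using Python’s built-in `sqlite3`. The `neighbours` dictionary is a hand-written stand-in for the lookup the module produces, not the repository’s API (see the readme.md for the real calls), and the `cases` table and its rows are made up for the example.

```python
import sqlite3

# Hand-written stand-in for the province -> [neighbouring provinces] lookup
neighbours = {
    "Bangkok": [
        "Nonthaburi", "Pathum Thani", "Chachoengsao",
        "Samut Prakan", "Samut Sakhon", "Nakhon Pathom",
    ],
}

# Illustrative table and rows only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (province TEXT, new_cases INTEGER)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [("Nonthaburi", 1), ("Samut Sakhon", 147), ("Chiang Mai", 3)],
)

# One "?" placeholder per neighbour keeps the query parameterised
names = neighbours["Bangkok"]
placeholders = ", ".join("?" for _ in names)
query = (
    "SELECT province, new_cases FROM cases "
    f"WHERE province IN ({placeholders}) ORDER BY province"
)
rows = conn.execute(query, names).fetchall()
print(rows)  # only the provinces next door to Bangkok
```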

Can I use Thai language (UTF-8) as my lookup and get neighbour results in Thai (UTF-8)? Short answer is yes. See the readme.md on the Github repo for full details with code samples.

## Over to you

There’s plenty more to say about this project, but if you’re interested in the details, go visit the Github repository. (Or send me a message, if you want extra detailed info).

Feel free to check it out.

## Towards COVID-19 Wave Risk Assessment Tool for BKK Residents: Results so far…

Dated 25th Jan 2021

## (A) New Cases for Bangkok and Nearby Provinces:

All data collected from Daily COVID-19 report, Thailand information [Daily COVID-19 cases reported]
Data Service: https://opendata.data.go.th/dataset/covid-19-daily
Last Updated: 24 January 2021 (2564 BE)

0 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-25:
25 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-24:
5 Key Clusters with 21 Cases in กรุงเทพมหานคร / Bangkok on 2021-01-24 (excluding state quarantine and arrivals ASQ/ALQ)
1 Days Since Last New Case

0 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-25:
12 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-24:
4 Key Clusters with 12 Cases in สมุทรปราการ / Samut Prakan on 2021-01-24 (excluding state quarantine and arrivals ASQ/ALQ)
1 Days Since Last New Case

0 New Cases in นนทบุรี / Nonthaburi on 2021-01-25:
1 New Cases in นนทบุรี / Nonthaburi on 2021-01-24:
1 Key Clusters with 1 Cases in นนทบุรี / Nonthaburi on 2021-01-24 (excluding state quarantine and arrivals ASQ/ALQ)
1 Days Since Last New Case

0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-25:
0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-24:
0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-23:
6 New Cases in ปทุมธานี / Pathum Thani on 2021-01-22:
2 Key Clusters with 6 Cases in ปทุมธานี / Pathum Thani on 2021-01-22 (excluding state quarantine and arrivals ASQ/ALQ)
3 Days Since Last New Case

0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-25:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-24:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-23:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-22:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-21:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-20:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-19:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-18:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-17:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-16:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-15:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-14:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-13:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-12:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-11:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-10:
1 New Cases in นครปฐม / Nakhon Pathom on 2021-01-09:
1 Key Clusters with 1 Cases in นครปฐม / Nakhon Pathom on 2021-01-09 (excluding state quarantine and arrivals ASQ/ALQ)
16 Days Since Last New Case

0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-25:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-24:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-23:
2 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-22:
1 Key Clusters with 2 Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-22 (excluding state quarantine and arrivals ASQ/ALQ)
3 Days Since Last New Case

0 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-25:
147 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-24:
1 Key Clusters with 147 Cases in สมุทรสาคร / Samut Sakhon on 2021-01-24 (excluding state quarantine and arrivals ASQ/ALQ)
1 Days Since Last New Case

0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-25:
7 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-24:
1 Key Clusters with 7 Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-24 (excluding state quarantine and arrivals ASQ/ALQ)
1 Days Since Last New Case

## (B) Last 14 days BANGKOK AND NEARBY PROVINCES:

All data collected from Daily COVID-19 report, Thailand information [Daily COVID-19 cases reported]
Data Service: https://opendata.data.go.th/dataset/covid-19-daily
Last Updated: 24 January 2021 (2564 BE)

0 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-25:
25 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-24:
14 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-23:
22 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-22:
23 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-21:
18 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-20:
22 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-19:
24 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-18:
16 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-17:
22 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-16:
36 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-15:
21 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-14:
28 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-13:
38 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-12:
46 New Cases in กรุงเทพมหานคร / Bangkok on 2021-01-11:

0 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-25:
12 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-24:
2 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-23:
3 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-22:
5 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-21:
3 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-20:
1 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-19:
3 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-18:
1 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-17:
3 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-16:
14 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-15:
6 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-14:
17 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-13:
13 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-12:
9 New Cases in สมุทรปราการ / Samut Prakan on 2021-01-11:

0 New Cases in นนทบุรี / Nonthaburi on 2021-01-25:
1 New Cases in นนทบุรี / Nonthaburi on 2021-01-24:
1 New Cases in นนทบุรี / Nonthaburi on 2021-01-23:
3 New Cases in นนทบุรี / Nonthaburi on 2021-01-22:
2 New Cases in นนทบุรี / Nonthaburi on 2021-01-21:
0 New Cases in นนทบุรี / Nonthaburi on 2021-01-20:
2 New Cases in นนทบุรี / Nonthaburi on 2021-01-19:
2 New Cases in นนทบุรี / Nonthaburi on 2021-01-18:
2 New Cases in นนทบุรี / Nonthaburi on 2021-01-17:
0 New Cases in นนทบุรี / Nonthaburi on 2021-01-16:
1 New Cases in นนทบุรี / Nonthaburi on 2021-01-15:
3 New Cases in นนทบุรี / Nonthaburi on 2021-01-14:
2 New Cases in นนทบุรี / Nonthaburi on 2021-01-13:
0 New Cases in นนทบุรี / Nonthaburi on 2021-01-12:
36 New Cases in นนทบุรี / Nonthaburi on 2021-01-11:

0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-25:
0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-24:
0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-23:
6 New Cases in ปทุมธานี / Pathum Thani on 2021-01-22:
4 New Cases in ปทุมธานี / Pathum Thani on 2021-01-21:
1 New Cases in ปทุมธานี / Pathum Thani on 2021-01-20:
0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-19:
1 New Cases in ปทุมธานี / Pathum Thani on 2021-01-18:
0 New Cases in ปทุมธานี / Pathum Thani on 2021-01-17:
4 New Cases in ปทุมธานี / Pathum Thani on 2021-01-16:
2 New Cases in ปทุมธานี / Pathum Thani on 2021-01-15:
1 New Cases in ปทุมธานี / Pathum Thani on 2021-01-14:
15 New Cases in ปทุมธานี / Pathum Thani on 2021-01-13:
5 New Cases in ปทุมธานี / Pathum Thani on 2021-01-12:
1 New Cases in ปทุมธานี / Pathum Thani on 2021-01-11:

0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-25:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-24:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-23:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-22:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-21:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-20:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-19:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-18:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-17:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-16:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-15:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-14:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-13:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-12:
0 New Cases in นครปฐม / Nakhon Pathom on 2021-01-11:

0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-25:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-24:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-23:
2 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-22:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-21:
1 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-20:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-19:
1 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-18:
1 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-17:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-16:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-15:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-14:
0 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-13:
3 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-12:
3 New Cases in พระนครศรีอยุธยา / Ayutthaya on 2021-01-11:

0 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-25:
147 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-24:
163 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-23:
217 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-22:
29 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-21:
27 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-20:
138 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-19:
320 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-18:
335 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-17:
165 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-16:
99 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-15:
208 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-14:
35 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-13:
176 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-12:
80 New Cases in สมุทรสาคร / Samut Sakhon on 2021-01-11:

0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-25:
7 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-24:
5 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-23:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-22:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-21:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-20:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-19:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-18:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-17:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-16:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-15:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-14:
1 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-13:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-12:
0 New Cases in สมุทรสงคราม / Samut Songkhram on 2021-01-11:

## What is it?

Above are (a) results that answer 3 research questions about assessing the risk of organising events and the expected risk of infection when visiting public spaces, and (b) results of cases in the past 14 days. The data source is the (mostly up-to-date) COVID-19-Daily open dataset on opendata.go.th published by the Digital Government Agency (DGA), with this dataset maintained by the Department of Disease Control (DDC); both are arms of Thailand’s government. The currently available official data services [1], [2], [3] and the map (currently 11 days out-of-date) at [4] do not give a simple way to assess risk for specific events at venues or places within a province or district. This project aims to provide the public with a measure of risk for fine-grained activity planning, with a special focus on locally-adjacent provinces.

### Research Questions

1. By province, how many new cases are announced?
2. By province, how many days since last new case was announced?
3. By province, how many “key” clusters exist (excluding state quarantine and arrivals ASQ/ALQ) and how many cases are there?

Specific column data used: “announce_date”, “province_of_isolation”, “risk”.
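The three questions can be answered with a few small functions over those columns. This is a sketch, not the project’s code: the rows below are made up, the `risk` labels and the exclusion label for state quarantine / ASQ/ALQ arrivals are assumptions, and dates are assumed to be ISO formatted.

```python
from collections import Counter
from datetime import date

# Illustrative rows only; the column names come from the dataset,
# the values (especially the "risk" labels) are hypothetical.
rows = [
    {"announce_date": "2021-01-24", "province_of_isolation": "Bangkok", "risk": "market cluster"},
    {"announce_date": "2021-01-24", "province_of_isolation": "Bangkok", "risk": "market cluster"},
    {"announce_date": "2021-01-24", "province_of_isolation": "Bangkok", "risk": "state quarantine"},
    {"announce_date": "2021-01-22", "province_of_isolation": "Pathum Thani", "risk": "workplace cluster"},
]

EXCLUDED_RISKS = {"state quarantine"}  # assumed label for quarantine/arrival cases


def new_cases(rows, province, on_date):
    """Q1: number of new cases announced for a province on a date."""
    return sum(
        1 for r in rows
        if r["province_of_isolation"] == province and r["announce_date"] == on_date
    )


def days_since_last_case(rows, province, today):
    """Q2: days since the latest announced case, or None if there are none."""
    dates = [
        date.fromisoformat(r["announce_date"])
        for r in rows if r["province_of_isolation"] == province
    ]
    return (today - max(dates)).days if dates else None


def key_clusters(rows, province, on_date):
    """Q3: (number of key clusters, cases in them), excluding quarantine cases."""
    clusters = Counter(
        r["risk"] for r in rows
        if r["province_of_isolation"] == province
        and r["announce_date"] == on_date
        and r["risk"] not in EXCLUDED_RISKS
    )
    return len(clusters), sum(clusters.values())


today = date(2021, 1, 25)
print(new_cases(rows, "Bangkok", "2021-01-24"))            # 3
print(days_since_last_case(rows, "Pathum Thani", today))   # 3
print(key_clusters(rows, "Bangkok", "2021-01-24"))         # (1, 2)
```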

### Future Work for Risk Assessment:

• By province, itemise the key clusters and their number of new cases over the past X days (X = 3, 7, 14; risk of new cases lowers after 14 days).

### What are other countries doing to deliver risk assessment to public?

Other risk assessment metrics exist (e.g. covidactnow.org or panditpranav), which take account of testing and vaccination data to give a probability of infection risk. I’m not yet aware of whether that data is released by Thailand’s Digital Government Agency (DGA), but that’s possible for the future.

## Next, Get Involved? / What is Next To DO?

In the near future, I will try to make use of the district as well as the province information to get a better sense of risk levels and risk-place associations; yet any public and busy spaces nearby can easily be considered “at risk”. Certainly, Future Work (see subheading) includes adding the accumulated key clusters over the past 3, 7 and 14 days. This will give a good sense of gradually lowering risk for a province, which can be combined with the data for adjacent provinces too. I’ll aim to add those and maybe some more. If you have ideas that can improve on this, feel welcome to say.

For me personally, I would like to see a daily morning email (or real-time alert) in my inbox, so I’ll look into making this an email subscription service. If that interests you too, just let me know and I’ll add you to the list.

## ON MEASURING MACHINE LEARNING MODELS AGAINST CONCRETE BUSINESS OBJECTIVES

##### REVIEW NOTES: DATA SCIENCE FOR BUSINESS BY PROVOST & FAWCETT: CHAPTER 7

I enjoyed reading this chapter. It’s insightful and well explained, with detailed examples, diagrams and graphics, on a few data science topics that correspond directly to conventional scientific research in computer science. That makes me happy, because these are crucial points, yet they are rarely the focus of Kaggle competitions, books on Machine Learning or Statistics, or the latest and greatest in TensorFlow, PyTorch and AutoML libraries (etc.), and they are too infrequently discussed in DL/AI/ML social posts and blogs. Below I have written about the points that are well worth taking home. These topics are broadly on:

• Careful consideration of what is desired from data science results.
• Expected value as a key evaluation framework.
• Consideration of appropriate comparative baselines, in machine learning models.
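To make the expected value idea concrete, here is a minimal sketch: the expected value of a classifier is the sum, over the confusion-matrix outcomes, of the probability of each outcome times the value of that outcome. The counts and the cost-benefit values below are hypothetical and are not taken from the book. The two trivial baselines (target everyone, target nobody) illustrate the third point in the list.

```python
def expected_value(confusion_counts, values):
    """Expected value per instance: sum of p(outcome) * value(outcome)."""
    total = sum(confusion_counts.values())
    return sum(
        (count / total) * values[outcome]
        for outcome, count in confusion_counts.items()
    )


# Hypothetical business value of each outcome (e.g. profit per customer).
values = {"TP": 50.0, "FP": -20.0, "FN": 0.0, "TN": 0.0}

# Hypothetical confusion-matrix counts on the same 100 test instances
# (40 actual positives, 60 actual negatives).
model = {"TP": 30, "FP": 20, "FN": 10, "TN": 40}
target_everyone = {"TP": 40, "FP": 60, "FN": 0, "TN": 0}
target_nobody = {"TP": 0, "FP": 0, "FN": 40, "TN": 60}

for name, counts in [
    ("model", model),
    ("target everyone", target_everyone),
    ("target nobody", target_nobody),
]:
    print(name, round(expected_value(counts, values), 2))
```

With these made-up numbers the model is worth more per instance than either baseline, but a different cost-benefit matrix could reverse that ordering even though the confusion matrices stay the same. That is why the business values have to be part of the evaluation.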

In Ubuntu (Debian, and the like) `apt` is our go-to CLI package installer. It handles everything in a single iconic command that every Linux user knows:

`sudo apt install <packageName>`

Sometimes, and I still don’t get why or when, a package’s shared library (dependency) is not installed.

For example, this happened to me today with MySQL-Workbench. I ran it on the CLI, and it reported that a dependent library was missing or couldn’t be found, with an error message like:

`$ mysql-workbench`
`/usr/lib/mysql-workbench/mysql-workbench-bin: error while loading shared libraries: libgdkmm-2.4.so.1: cannot open shared object file: No such file or directory`

## Key points of the fix are:

1. Ensure the `locate` database (e.g. from mlocate or slocate) is up to date with current information about file locations.

`sudo updatedb`

2. Ensure the file exists. (No output means the file was not found.)

`locate libgdkmm-2.4.so.1`

3. Reinstall the package that provides the file if it is missing.
Here `-f` (`--fix-broken`) asks apt to repair broken dependencies, and `--reinstall` forces a reinstall if the package is already installed.

`sudo apt-get install -f --reinstall libgtkmm-2.4-1v5`

4. Ensure the application configuration is looking in the correct location for the shared library files.

Today I didn’t need this. But essentially, MySQL-Workbench on Ubuntu is started through a `#!/bin/bash` launcher script that runs a series of commands before starting the application binary. In that script, the following environment variables can be used to define the configuration locations: `export MWB_BINARIES_DIR=xyz` and `export LD_LIBRARY_PATH=xyz`.

In my case, the application script uses those environment variable values on the line that executes the runtime binary, as linker search paths to the corresponding shared library files written in C/C++. In interpreted-language applications, those environment variable values might be passed as environment arguments into the executed code (for example in Python) or as library classpath entries on the runtime execution line (for example in Java). Alas, I didn’t need to change those locations from the defaults, but that’s how it works.
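As a complement to the `locate` check in step 2, the standard `ldd` tool lists the shared libraries a binary needs and marks unresolved ones with `=> not found`. The sketch below wraps that in Python. The function names are hypothetical, and it assumes `ldd` is on the `PATH` and prints its usual `name => path` format.

```python
import subprocess


def parse_missing_libraries(ldd_output):
    """Return library names that ldd reports as '=> not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibgdkmm-2.4.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libraries(binary_path):
    """Run ldd on a binary and list the shared libraries it cannot resolve."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return parse_missing_libraries(result.stdout)
```

For the error above, this could be called as `missing_libraries("/usr/lib/mysql-workbench/mysql-workbench-bin")`. Note that `ldd` reads the current environment, so if the launcher script sets `LD_LIBRARY_PATH`, the same value needs to be exported before running the check.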

That’s how to resolve missing shared library dependencies in Ubuntu (and Debian and other apt-based distributions).

## On Measuring the Senior, In Senior Software Engineering Roles

Labelling software engineers as “Senior”, “Mid” and “Junior” comes up from time to time in developer and programmer forums. I’m not a fan of labels for people or groups of people, and Seniority and Skill/Knowledge/Ability Levels get to me because they are so ambiguous. So it is down to us to contribute and discuss in order to reach a clear definition.

A truth of seniority, across all genres, is group-wide effect. It’s leadership, it’s empathy, it’s improving the individuals and the group as a whole for the group’s common interest. It’s a positive improvement, it’s team-wide developer productivity and overall business-wide productivity improvement. But what does that mean for Developers and Software Engineers?

Continue reading “On Measuring the Senior, In Senior Software Engineering Roles”

## Adding to the Conversation on Data Science Training: Looking into the Future

2020 May 9th.

Please note: this is a time-sensitive article. It’s likely to be wrong immediately after it is written, but I publish it because it marks a step in the process. The below is my response to a discussion on how to teach and how to learn Data Science in order to be most effective (as an employee, as a service to businesses and as a service to society). A key aspect raised during the discussion was the consequences of focusing on domain expertise, of focusing on technical expertise, and of spreading focus between both. My reply below (I believe) adds value and helps guide the definition of Data Science teaching, learning and the ongoing strategy involved in continuous “lifelong” learning, for as long as Data Science remains as it is. I concede that the view presented below could easily have drawn on many other influences, with more citations, viewpoints, argument points and evidence. But this is the nature of a conversation held under a time limit. So, here goes:

Continue reading “Adding to the Conversation on Data Science Training: Looking into the Future”

## Book review 2/2 on Robot Proof: Higher Education in the Age of AI

I finished the book by Joseph Aoun a little while ago, and I’ve been sitting on my notes, letting them stir. I think I have a fairly safe conclusion for its second half. That said, I would expect that those with an understanding and empathetic relationship with their CS students and their families will have been on the cusp of conclusions similar to those drawn by Aoun in Robot Proof in 2017.

Continue reading “Book review 2/2 on Robot Proof: Higher Education in the Age of AI”