Johan Pensar

Associate Professor - Statistics and Data Science

Email johanpen@math.uio.no

Room NHA 814

Visiting address Moltke Moes vei 35, Niels Henrik Abels hus, 0851 Oslo

Postal address Postboks 1053 Blindern, 0316 Oslo

Other affiliations Department of Education (Student)

You can find more information on my English page.

Keywords: Statistics, data science, probabilistic modelling

Pavlović, Milena; al Hajj, Ghadi; Kanduri, Chakravarthi; Pensar, Johan; Wood, Mollie Elizabeth; Sollid, Ludvig Magne et al. (2024). Improving generalization of machine learning-identified biomarkers using causal modelling with examples from immune receptor diagnostics. Nature Machine Intelligence. 6(1), pp. 15–24. doi: 10.1038/s42256-023-00781-8.
Hjort, Anders Dahl; Scheel, Ida; Sommervoll, Dag Einar & Pensar, Johan (2023). Locally interpretable tree boosting: An application to house price prediction. Decision Support Systems. ISSN 0167-9236. 178. doi: 10.1016/j.dss.2023.114106.
al Hajj, Ghadi; Pensar, Johan & Sandve, Geir Kjetil Ferkingstad (2023). DagSim: Combining DAG-based model structure with unconstrained data types and relations for flexible, transparent, and modularized data simulation. PLOS ONE. ISSN 1932-6203. 18(4). doi: 10.1371/journal.pone.0284443.
Corander, Jukka; Hanage, William P & Pensar, Johan (2022). Causal discovery for the microbiome. Lancet Microbe. ISSN 2666-5247. 3(11), pp. e881–e887. doi: 10.1016/S2666-5247(22)00186-0.
Measurement and manipulation of the microbiome is generally considered to have great potential for understanding the causes of complex diseases in humans, developing new therapies, and finding preventive measures. Many studies have found significant associations between the microbiome and various diseases; however, Koch's classical postulates remind us about the importance of causative reasoning when considering the relationship between microbes and a disease manifestation. Although causal discovery in observational microbiome data faces many challenges, methodological advances in causal structure learning have improved the potential of data-driven prediction of causal effects in large-scale biological systems. In this Personal View, we show the capability of existing methods for inferring causal effects from metagenomic data, and we highlight ways in which the introduction of causal structures that are more flexible than existing structures offers new opportunities for causal reasoning. Our observations suggest that microbiome research can further benefit from tools developed in the past 5 years in causal discovery and learn from their applications elsewhere.
Hjort, Anders Dahl; Pensar, Johan; Scheel, Ida & Sommervoll, Dag Einar (2022). House price prediction with gradient boosted trees under different loss functions. Journal of Property Research. ISSN 0959-9916. 39(4), pp. 333–364. doi: 10.1080/09599916.2022.2070525.
Many banks and credit institutions are required to assess the value of dwellings in their mortgage portfolio. This valuation often relies on an Automated Valuation Model (AVM). Moreover, these institutions often report the model's accuracy by two numbers: the fraction of predictions within ±20% and ±10% range from the true values. Until recently, AVMs tended to be hedonic regression models, but lately machine learning approaches like random forest and gradient boosted trees have been increasingly applied. Both the traditional approaches and the machine learning approaches rely on minimising mean squared prediction error, and not the number of predictions in the ±20% and ±10% range. We investigate whether introducing a loss function closer to the AVM's actual loss measure improves performance in machine learning approaches, specifically for a gradient boosted tree approach. This loss function yields an improvement from 89.4% to 90.0% of predictions within ±20% of the true value on a data set of N=126719 transactions from the Norwegian housing market between 2013 and 2015, with the biggest improvements in performance coming from the lower price segments. We also find that a weighted average of models with different loss functions improves performance further, yielding 90.4% of the observations within ±20% of the true value.
Pavlović, Milena; Scheffer, Lonneke; Motwani, Keshav; Kanduri, Chakravarthi; Kompova, Radmila; Vazov, Nikolay Aleksandrov et al. (2021). The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires. Nature Machine Intelligence. 3(11), pp. 936–944. doi: 10.1038/s42256-021-00413-z.
Chewapreecha, Claire; Pensar, Johan; Chattagul, Supaksorn; Pesonen, Maiju; Sangphukieo, Apiwat; Boonklang, Phumrapee et al. (2021). Co-evolutionary Signals Identify Burkholderia pseudomallei Survival Strategies in a Hostile Environment. Molecular Biology and Evolution (MBE). ISSN 0737-4038. 39(1). doi: 10.1093/molbev/msab306.
Suotsalo, Kimmo; Xu, Yingying; Corander, Jukka & Pensar, Johan (2021). High-dimensional structure learning of sparse vector autoregressive models using fractional marginal pseudo-likelihood. Statistics and computing. ISSN 0960-3174. 31(73). doi: 10.1007/s11222-021-10049-z.
Mageiros, Leonardos; Meric, Guillaume; Bayliss, Sion; Pensar, Johan; Pascoe, Ben; Mourkas, Evangelos et al. (2021). Genome evolution and the emergence of pathogenicity in avian Escherichia coli. Nature Communications. ISSN 2041-1723. 12(1). doi: 10.1038/s41467-021-20988-w.
Viinikka, Jussi; Hyttinen, Antti; Pensar, Johan & Koivisto, Mikko (2020). Towards Scalable Bayesian Learning of Causal DAGs. Advances in Neural Information Processing Systems. ISSN 1049-5258.
Tadei, Alessandro; Haajanen, Juulia; Pensar, Johan; Santtila, Pekka & Antfolk, Jan (2020). Counteracting deceptive responding in the Finnish Investigative Instrument of Child Sexual Abuse (FICSA). Journal of Sexual Aggression. ISSN 1355-2600. doi: 10.1080/13552600.2020.1846802.
Top, Janetta; Arredondo-Alonso, Sergio; Schürch, Anita C.; Puranen, Santeri; Pesonen, Maiju; Pensar, Johan et al. (2020). Genomic rearrangements uncovered by genome-wide co-evolution analysis of a major nosocomial pathogen, Enterococcus faecium. Microbial Genomics. ISSN 2057-5858. 6(12), pp. 1–8. doi: 10.1099/mgen.0.000488.

See all works in Cristin

Sandve, Geir Kjetil Ferkingstad & Pensar, Johan (2022). Machine Learning and Causality.
Mageiros, Leonardos; Meric, Guillaume; Bayliss, Sion; Pensar, Johan; Pascoe, Ben; Mourkas, Evangelos et al. (2021). Author Correction: Genome evolution and the emergence of pathogenicity in avian Escherichia coli (Nature Communications, (2021), 12, 1, (765), 10.1038/s41467-021-20988-w). Nature Communications. ISSN 2041-1723. 12(1). doi: 10.1038/s41467-021-22238-5.

Published 18 Feb. 2020 08:47 - Last modified 25 Jan. 2023 10:59

Research groups

Statistics and data science