
    Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02

    The University Library (UB) maintains an extensive archive of electronic media, ranging from full-text collections, newspaper archives, dictionaries and encyclopedias to comprehensive bibliographies and more than 1,000 databases. On iTunes U, the UB provides, among other things, a selection of dissertations by doctoral candidates at LMU. (This is part 2 of 2 of the collection 'Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU'.)


    Episodes (84)

    Network-based analysis of gene expression data

    The methods of molecular biology for the quantitative measurement of gene expression have undergone rapid development in the past two decades. High-throughput assays based on microarray and RNA-seq technology now enable whole-genome studies in which several thousand genes can be measured at a time. However, this has also imposed serious challenges on data storage and analysis, which are the subject of the young but rapidly developing field of computational biology. Explaining observations made on such a large scale requires suitable and accordingly scaled models of gene regulation. Detailed models, as available for single genes, need to be extended and assembled into larger networks of regulatory interactions between genes and gene products. Incorporating such networks into methods for data analysis is crucial for identifying the molecular mechanisms that drive the observed expression. As methods for this purpose emerge in parallel to each other and without a known gold standard, results need to be critically checked in a competitive setup and in the context of the available rich literature corpus. This work is centered on and contributes to the following subjects, each of which represents an important and distinct research topic in the field of computational biology: (i) construction of realistic gene regulatory network models; (ii) detection of subnetworks that are significantly altered in the data under investigation; and (iii) systematic biological interpretation of detected subnetworks. For the construction of regulatory networks, I review existing methods with a focus on curation and inference approaches. I first describe how literature curation can be used to construct a regulatory network for a specific process, using the well-studied diauxic shift in yeast as an example. In particular, I address the question of how a detailed understanding, as available for the regulation of single genes, can be scaled up to the level of larger systems. I subsequently inspect methods for large-scale network inference, showing that they are significantly skewed towards master regulators. A recalibration strategy is introduced and applied, yielding an improved genome-wide regulatory network for yeast. To detect significantly altered subnetworks, I introduce GGEA as a method for network-based enrichment analysis. The key idea is to score regulatory interactions within functional gene sets for consistency with the observed expression. Compared to other recently published methods, GGEA yields results that consistently and coherently align expression changes with known regulation types and are thus easier to explain. I also suggest and discuss several substantial enhancements of the original method that improve its applicability, outcome and runtime. For the systematic detection and interpretation of subnetworks, I have developed the EnrichmentBrowser software package. It implements several state-of-the-art methods besides GGEA and allows the user to combine and explore results across methods. As part of the Bioconductor repository, the package provides unified access to the different methods and thus greatly simplifies their usage for biologists. Extensions to this framework that support the automation of biological interpretation routines are also presented. In conclusion, this work contributes substantially to the research field of network-based analysis of gene expression data with respect to regulatory network construction, subnetwork detection, and their biological interpretation. It also covers recent developments and areas of ongoing research, which are discussed in the context of current and future questions arising from the new generation of genomic data.
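The consistency idea at the core of GGEA can be illustrated with a small sketch (all names and the clamped product score below are illustrative assumptions; the actual implementation in the Bioconductor EnrichmentBrowser package uses fuzzy logic on expression measures): an activating edge is consistent if regulator and target change in the same direction, an inhibiting edge if they change in opposite directions.

```python
# Minimal sketch of edge-consistency scoring in the spirit of GGEA.
# All names are illustrative; the real implementation lives in the
# Bioconductor EnrichmentBrowser package.

def edge_consistency(fc_regulator, fc_target, edge_type):
    """Score one regulatory edge against observed log2 fold changes.

    +1 means fully consistent with the annotated regulation type,
    -1 fully inconsistent.
    """
    sign = 1.0 if edge_type == "activation" else -1.0  # inhibition flips the expectation
    agreement = fc_regulator * fc_target * sign
    return max(-1.0, min(1.0, agreement))

def gene_set_score(edges, fold_changes):
    """Average consistency over all edges that fall within a gene set."""
    scores = [
        edge_consistency(fold_changes[src], fold_changes[tgt], etype)
        for src, tgt, etype in edges
        if src in fold_changes and tgt in fold_changes
    ]
    return sum(scores) / len(scores) if scores else 0.0

# Toy example: one consistent activation, one inconsistent inhibition.
fc = {"TF1": 1.2, "geneA": 0.9, "geneB": 0.8}
network = [("TF1", "geneA", "activation"), ("TF1", "geneB", "inhibition")]
print(gene_set_score(network, fc))
```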

    Context-based RNA-seq mapping

    In recent years, the sequencing of RNA (RNA-seq) using next-generation sequencing (NGS) technology has become a powerful tool for analyzing the transcriptomic state of a cell. Modern NGS platforms allow RNA-seq experiments to be performed in a few days, resulting in millions of short sequencing reads. A crucial step in analyzing RNA-seq data is determining the transcriptomic origin of the sequencing reads (read mapping). In principle, read mapping is a sequence alignment problem, in which the short sequencing reads (30-500 nucleotides) are aligned to much larger reference sequences such as the human genome (3 billion nucleotides). In this thesis, we present ContextMap, an RNA-seq mapping approach that evaluates the context of the sequencing reads to determine the most likely origin of every read. The context of a sequencing read is defined by all other reads aligned to the same genomic region. The ContextMap project started with a proof-of-concept study, in which we showed that our approach is able to improve existing read mapping results provided by other mapping programs. Subsequently, we developed a standalone version of ContextMap. This implementation no longer relied on the mapping results of other programs but determined initial alignments itself using a modification of the Bowtie short read alignment program. However, the original ContextMap implementation had several drawbacks. In particular, it was not able to predict reads spanning more than two exons or to detect insertions and deletions (indels). Furthermore, ContextMap depended on a modification of a specific Bowtie version. Thus, it could benefit neither from Bowtie updates nor from novel developments (e.g. improved running times) in the area of short read alignment software. To address these problems, we developed ContextMap 2, an extension of the original ContextMap algorithm. The key features of ContextMap 2 are the context-based resolution of ambiguous read alignments and the accurate detection of reads crossing an arbitrary number of exon-exon junctions or containing indels. Furthermore, a plug-in interface is provided that allows for the easy integration of alternative short read alignment programs (e.g. Bowtie 2 or BWA) into the mapping workflow. The performance of ContextMap 2 was evaluated on real-life as well as synthetic data and compared to other state-of-the-art mapping programs. We found that ContextMap 2 had very low rates of misplaced reads and incorrectly predicted junctions or indels. Additionally, recall values were as high as for the top competing methods. Moreover, the runtime of ContextMap 2 was at least twofold lower than that of the best competitors. In addition to the mapping of sequencing reads to a single reference, the ContextMap approach allows the investigation of several potential read sources (e.g. the human host and infecting pathogens) in parallel. Thus, ContextMap can be applied to mine for infections or contaminations, or to map data from meta-transcriptomic studies. Furthermore, we developed methods based on mapping-derived statistics that allow assessing the confidence of mappings to identified species and detecting false positive hits. ContextMap was evaluated on three real-life data sets, and the results were compared to metagenomics tools. Here, we showed that ContextMap can successfully identify the species contained in a sample. Moreover, in contrast to most other metagenomics approaches, ContextMap also provides read mapping results for individual species. As a consequence, read mapping results determined by ContextMap can be used to study the gene expression of all species contained in a sample at the same time. Thus, ContextMap might be applied in clinical studies in which the influence of infecting agents on host organisms is investigated. The methods presented in this thesis allow for accurate and fast mapping of RNA-seq data. As the amount of available sequencing data increases constantly, these methods will likely become an important part of many RNA-seq data analyses and thus make a valuable contribution to research in the field of transcriptomics.
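The central idea of context evaluation can be sketched as follows: among several candidate alignments for a read, prefer the location whose surrounding genomic window is supported by the most other aligned reads. The window size and scoring rule below are simplified assumptions, not ContextMap's actual model.

```python
# Illustrative sketch of the context idea behind ContextMap: among several
# candidate alignments for a read, prefer the genomic location whose
# neighborhood is supported by the most other aligned reads.
from collections import defaultdict

WINDOW = 1000  # context window size in bp (illustrative choice)

def resolve_ambiguous(read_id, candidates, all_alignments):
    """candidates: list of (chrom, pos); all_alignments: {chrom: [pos, ...]}"""
    def context_support(chrom, pos):
        return sum(1 for p in all_alignments[chrom] if abs(p - pos) <= WINDOW)
    return max(candidates, key=lambda c: context_support(*c))

alignments = defaultdict(list)
for chrom, pos in [("chr1", 1200), ("chr1", 1500), ("chr1", 1800), ("chr7", 90_000)]:
    alignments[chrom].append(pos)

# The read maps equally well to chr1:1400 and chr7:90_500; the chr1 context
# contains three nearby reads, the chr7 context only one, so chr1 wins.
print(resolve_ambiguous("read42", [("chr1", 1400), ("chr7", 90_500)], alignments))
```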

    Computing hybridization networks using agreement forests

    Rooted phylogenetic trees are widely used in biology to represent the evolutionary history of certain species. Usually, such a tree is a simple binary tree containing only internal nodes of in-degree one and out-degree two, representing specific speciation events. In applied phylogenetics, however, trees can contain nodes of out-degree larger than two because often there is insufficient information available to resolve some orderings of speciation events, and the common way to model this uncertainty is to use nonbinary nodes (i.e., nodes of out-degree at least three), also denoted polytomies. Moreover, in addition to such speciation events, there exist certain biological events that cannot be modeled by a tree and thus require the more general concept of rooted phylogenetic networks or, more specifically, of hybridization networks. Examples of such reticulate events are horizontal gene transfer, hybridization, and recombination. Nevertheless, in order to construct hybridization networks, the less general concept of a phylogenetic tree can still be used as a building block. More precisely, in a first step, phylogenetic trees for a set of species, each based on a distinct orthologous gene, are constructed. In a second step, specific sets containing common subtrees of those trees, known as maximum acyclic agreement forests, are calculated and then glued together into a single hybridization network. In such a network, hybridization nodes (i.e., nodes of in-degree at least two) can exist, representing potential reticulate events of the underlying evolutionary history. As such events are considered rare phenomena from a biological point of view, networks representing a minimum number of reticulate events, denoted the hybridization number, are of particular interest. Consequently, from a mathematical point of view, the problem of calculating hybridization networks can be briefly described as follows: given a set T of rooted phylogenetic trees sharing the same set of taxa, compute a hybridization network N displaying T with minimum hybridization number. In this context, we say that such a network N displays a phylogenetic tree T if we can obtain T from N by removing as well as contracting some of its nodes and edges. Unfortunately, this is a computationally hard problem (i.e., it is NP-hard), even in the simplest case of just two binary input trees. In this thesis, we present several methods tackling this NP-hard problem. Our first approach describes how to compute a representative set of minimum hybridization networks for two binary input trees. For that purpose, our approach implements the first non-naive algorithm, called allMAAFs, calculating all maximum acyclic agreement forests for two rooted binary phylogenetic trees on the same set of taxa. In a subsequent step, in order to maximize the efficiency of the algorithm allMAAFs, we additionally developed several modifications, each reducing the number of computational steps and thus significantly improving its practical runtime. Our second approach extends the first, making the underlying algorithm accessible to more than two binary input trees. For this purpose, our approach implements the algorithm allHNetworks, the first algorithm calculating all relevant hybridization networks displaying a set of rooted binary phylogenetic trees on the same set of taxa, which is a preferable feature when studying hybridization events. Lastly, we have developed a generalization of our second approach that can deal with multiple nonbinary input trees. For that purpose, our approach implements the first non-naive algorithm, called allMulMAAFs, calculating a relevant set of nonbinary maximum acyclic agreement forests for two rooted (nonbinary) phylogenetic trees on the same set of taxa. Each of the algorithms above is integrated into our user-friendly Java-based software package Hybroscale, which is freely available and platform independent, so that it runs on all major operating systems. Our program provides a graphical user interface for visualizing trees and networks. Moreover, it facilitates the interpretation of computed hybridization networks by adding specific features to their graphical representation, thus supporting biologists in investigating reticulate evolution. In addition, we have implemented a method using a user-friendly SQL-style modeling language for filtering the usually large number of reported networks.
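To make the optimization target concrete, the following sketch counts the hybridization number of a network given as an edge list, i.e. the number of extra incoming edges at reticulation nodes (the edge-list encoding is an illustrative assumption; Hybroscale itself is a Java package):

```python
# Small sketch of the hybridization number as defined above: the sum over
# all nodes of (in-degree - 1) for nodes with in-degree >= 2.
from collections import Counter

def hybridization_number(edges):
    indeg = Counter(child for _, child in edges)
    return sum(d - 1 for d in indeg.values() if d >= 2)

# Tiny network: root r with children a and h; h is a hybridization node
# receiving a second incoming edge from a.
edges = [("r", "a"), ("r", "h"), ("a", "h"), ("h", "x"), ("a", "y")]
print(hybridization_number(edges))  # -> 1
```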

    Exploiting autobiographical memory for fallback authentication on smartphones

    Smartphones have advanced from simple communication devices to multipurpose devices that capture almost every single moment in our daily lives and thus contain sensitive data like photos or contact information. In order to protect this data, users can choose from a variety of authentication schemes. However, what happens if one of these schemes fails, for example, when users are not able to provide the correct password within a limited number of attempts? So far, situations like this have been neglected by the usable security and privacy community, which mainly focuses on primary authentication schemes. But fallback authentication is equally important to enable users to regain access to their devices (and data) in case of lockouts. In theory, any scheme for primary authentication on smartphones could also be used as a fallback solution. In practice, fallback authentication happens less frequently and imposes different requirements and challenges on its design. The aim of this work is to understand and address these challenges. We investigate the occurrences of fallback authentication on smartphones in real life in order to grasp the characteristics that fallback authentication conveys. We also gain deeper insights into the difficulties that users have to cope with during lockout situations. In combination with the knowledge from previous research, these insights are valuable for providing a detailed definition of fallback authentication, which has been missing so far. The definition covers usability and security characteristics and depicts the differences to primary authentication. Furthermore, we explore the potential of autobiographical memory, a part of the human memory that relates to personal experiences of the past, for the design of alternative fallback schemes to overcome the well-known memorability issues of current solutions. We present the design and evaluation of two static approaches that are based on the memory of locations and special drawings. We also cover three dynamic approaches that relate to recent smartphone activities, icon arrangements and installed apps. This series of work allows us to analyze the suitability of different types of memories for fallback authentication. It also helps us to extend the definition of fallback authentication by identifying factors that influence the quality of fallback schemes. The main contributions of this thesis can be summarized as follows: First, it gives essential insights into the relevance, frequency and problems of fallback authentication on smartphones in real life. Second, it provides a clear definition of fallback authentication to classify authentication schemes based on usability and security properties. Third, it shows example implementations and evaluations of static and dynamic fallback schemes that are based on different autobiographical memories. Finally, it discusses the advantages and disadvantages of these memories and gives recommendations for their design, evaluation and analysis in the context of fallback authentication.
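As an illustration of the dynamic schemes mentioned above, the following sketch generates and verifies a fallback challenge based on installed apps; grid size, number of decoys, and the acceptance threshold are invented for illustration and are not taken from the thesis:

```python
# Illustrative sketch of an installed-apps fallback challenge: the user must
# pick their own apps out of a grid that also contains decoys.
import random

def build_challenge(installed_apps, decoy_pool, n_real=4, n_decoys=8):
    real = random.sample(installed_apps, n_real)
    decoys = random.sample([a for a in decoy_pool if a not in installed_apps], n_decoys)
    grid = real + decoys
    random.shuffle(grid)
    return grid, set(real)

def verify(selection, real_apps, min_correct=3):
    correct = len(selection & real_apps)
    wrong = len(selection - real_apps)
    return correct >= min_correct and wrong == 0

grid, answer = build_challenge(
    ["Maps", "Podcasts", "Banking", "Chess", "Weather"],
    ["Yoga", "Radio", "Scanner", "Stocks", "Translate",
     "Recipes", "Metronome", "Karaoke", "Compass"],
)
print(grid)
print(verify(answer, answer))  # selecting exactly the real apps passes
```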

    Cross-species network and transcript transfer

    Metabolic processes, signal transduction, gene regulation, as well as gene and protein expression are largely controlled by biological networks. High-throughput experiments allow the measurement of a wide range of cellular states and interactions. However, networks are often not known in detail for specific biological systems and conditions. Gene and protein annotations are often transferred from model organisms to the species of interest, so the question arises whether biological networks can be transferred between species or whether they are specific to individual contexts. In this thesis, the following aspects are investigated: (i) the conservation and (ii) the cross-species transfer of eukaryotic protein-interaction and gene regulatory (transcription factor-target) networks, as well as (iii) the conservation of alternatively spliced variants. In the simplest case, interactions can be transferred between species based solely on the sequence similarity of the orthologous genes. However, such a transfer often results either in the transfer of only a few interactions (medium/high sequence similarity threshold) or in the transfer of many speculative interactions (low sequence similarity threshold). Thus, advanced network transfer approaches also consider the annotations of the orthologous genes involved in the transfer, as well as features derived from the network structure, in order to enable a reliable interaction transfer even between phylogenetically very distant species. In this work, such an approach for the transfer of protein interactions is presented (COIN). COIN uses a sophisticated machine-learning model to label transferred interactions as either correctly transferred (conserved) or incorrectly transferred (not conserved). The comparison and cross-species transfer of regulatory networks is more difficult than the transfer of protein interaction networks, as a large fraction of the known regulations is described only in the (not machine-readable) scientific literature. In addition, compared to protein interactions, only a few conserved regulations are known, and regulatory elements appear to be strongly context-specific. In this work, the cross-species analysis of regulatory interaction networks is enabled by software tools and databases for global (ConReg) and thousands of context-specific (CroCo) regulatory interactions, derived and integrated from the scientific literature, binding site predictions and experimental data. Genes and their protein products are the main players in biological networks. To date, however, the fact that a gene can encode several different proteins has been largely neglected. These alternative proteins can differ strongly from each other with respect to their molecular structure, function and role in networks. The identification of conserved and species-specific splice variants and the integration of variants into network models will allow a more complete cross-species transfer and comparison of biological networks. With ISAR, we support the cross-species transfer and comparison of alternative variants by introducing a gene-structure-aware (i.e. exon-intron-structure-aware) multiple sequence alignment approach for variants from orthologous and paralogous genes. The methods presented here and the accompanying databases allow the cross-species transfer of biological networks, the comparison of thousands of context-specific networks, and the cross-species comparison of alternatively spliced variants. Thus, they can serve as a starting point for understanding regulatory and signaling mechanisms in many biological systems.
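The baseline that approaches like COIN improve upon, purely similarity-based interaction transfer via orthologs, can be sketched in a few lines (gene names, similarity scores, and the 0.4 cutoff are illustrative):

```python
# Minimal sketch of sequence-similarity-based interaction transfer, the
# baseline described above. Orthology is given as a simple mapping with a
# similarity score per gene.
def transfer_interactions(source_net, orthologs, min_sim=0.4):
    """source_net: set of (geneA, geneB) pairs in the source species.
    orthologs: {source_gene: (target_gene, similarity)}."""
    transferred = set()
    for a, b in source_net:
        if a in orthologs and b in orthologs:
            (a2, sim_a), (b2, sim_b) = orthologs[a], orthologs[b]
            if min(sim_a, sim_b) >= min_sim:  # both ends must be reliable
                transferred.add((a2, b2))
    return transferred

yeast_net = {("CDC28", "CLN2"), ("SNF1", "MIG1")}
orth = {"CDC28": ("CDK1", 0.78), "CLN2": ("CCNB1", 0.35),
        "SNF1": ("PRKAA1", 0.62), "MIG1": ("ZNF76", 0.45)}
print(transfer_interactions(yeast_net, orth))
# Only one edge survives; the other is dropped because one endpoint's best
# hit falls below the similarity threshold.
```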

    Mean field limits for charged particles

    The aim of this thesis is to provide a rigorous mathematical derivation of the Vlasov-Poisson equation and the Vlasov-Maxwell equations in the large N limit of interacting charged particles. We will extend a method previously proposed by Boers and Pickl to perform a mean field limit for the Vlasov-Poisson equation with the full Coulomb singularity and an N-dependent cut-off decreasing as $N^{-1/3 + \epsilon}$. We will then discuss an alternative approach, deriving the Vlasov-Poisson equation as a combined mean field and point-particle limit of an N-particle Coulomb system of extended charges. Finally, we will combine both methods to prove a mean field limit for the relativistic Vlasov-Maxwell system in 3+1 dimensions. In each case, convergence of the empirical measures to solutions of the corresponding mean field equation can be shown for typical initial conditions. This implies, in particular, the propagation of chaos for the respective dynamics.
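For orientation, the mean field equation in question has the following standard form for the phase-space density $f_t(x,v)$ (stated in a common textbook normalization; signs and constants vary by convention):

```latex
% Vlasov-Poisson system for the phase-space density f_t(x,v):
\partial_t f_t + v \cdot \nabla_x f_t + E[f_t] \cdot \nabla_v f_t = 0,
\qquad
E[f_t](x) = \int \frac{x-y}{|x-y|^3}\, \rho_t(y)\, \mathrm{d}y,
\qquad
\rho_t(x) = \int f_t(x,v)\, \mathrm{d}v .
```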

    Fostering awareness and collaboration in large-class lectures

    For decades, higher education has been shaped by large-class lectures, which are characterized by large anonymous audiences. Well-known issues of large-class lectures are a rather low degree of interactivity and a notable passivity of students, which are aggravated by the social environment created by large audiences. However, research indicates that active involvement is indispensable for learning to be successful. Active participation in lectures is thus often a goal of technology-supported lectures. An outstanding feature of social media is certainly their capability of facilitating interactions in large groups of participants. Social media thus seem to be a suitable basis for technology-enhanced learning in large-class lectures. However, existing general-purpose social media come with several shortcomings that are assumed to hinder their proper use in lectures. This thesis therefore deals with the conception of a social medium, called Backstage, specially tailored for use in large-class lectures. Backstage supports both lecturer- and student-initiated communication by means of an Audience Response System and a backchannel. Audience Response Systems allow running quizzes in lectures, e.g., to assess knowledge, and can thus be seen as technological support for question asking by the lecturer. These systems collect and aggregate the students' answers and report the results back to the audience in real time. Audience Response Systems have been shown to be a very effective means of sustaining lecture-relevant interactivity in lectures. Using a backchannel, students can initiate communication with peers or the lecturer. The backchannel is built upon microblogging, which has become a very popular communication medium in recent years. A key characteristic of microblogging is that messages are very concise, comprising only a few words. This brief form of communication makes microblogging quite appealing for a backchannel in lectures. A preliminary evaluation of a first prototype, conducted at an early stage of the project, however, indicated that a conventional digital backchannel is prone to information overload: even a relatively small group can quickly render the backchannel discourse incomprehensible. This incomprehensibility is rooted in a lack of interactional coherence, a rather low communication efficiency, a high information entropy, and a lack of connection between the backchannel and the frontchannel, i.e., the lecture's discourse. This thesis investigates remedies to these issues. To this end, lecture slides are integrated into the backchannel to structure and provide context for the backchannel discourse. The backchannel communication is revised to realize a collaborative annotation of slides by typed backchannel posts. To reduce information entropy, backchannel posts have to be assigned to predefined categories. To establish a connection with the frontchannel, backchannel posts have to be attached to appropriate locations on slides. The lecture slides also improve communication efficiency by routing, which means that the backchannel can be filtered to show only the posts belonging to the currently displayed slide. Further improvements and modifications, e.g., of the Audience Response System, are described in this thesis. This thesis also reports on an evaluation of Backstage in four courses. The outcomes are promising: students welcomed the use of Backstage. Backstage not only succeeded in increasing interactivity but also contributed to social awareness, which is a prerequisite of active participation. Furthermore, the backchannel communication was highly lecture-relevant. As another important result, an additional study conducted in collaboration with educational scientists showed that students in Backstage-supported lectures used their mobile devices to a greater extent for lecture-relevant activities than students in conventional lectures, in which mobile devices were mostly used for lecture-unrelated activities. To establish social control of the backchannel, this thesis investigates rating and ranking of backchannel posts. Furthermore, it proposes a reputation system that aims at incentivizing desirable behavior in the backchannel. The reputation system is based on an eigenvector centrality similar to Google's PageRank. It is highly customizable and also allows quiz performance to be considered in the computation of reputation. All of these approaches (rating, ranking and reputation systems) have proven to be very effective mechanisms of social control in general-purpose social media.
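The flavor of such a PageRank-like reputation system can be sketched with a plain power iteration (the rating graph, damping factor, and update rule below are illustrative assumptions, not Backstage's actual formula):

```python
# Sketch of an eigenvector-centrality reputation score: users who receive
# positive ratings from highly reputed users gain more reputation. The
# damping factor 0.85 mirrors the classic PageRank choice.
import numpy as np

def reputation(rating_matrix, damping=0.85, iters=100):
    """rating_matrix[i, j] = weight of positive ratings user i gave user j."""
    n = rating_matrix.shape[0]
    # Row-normalize so each user distributes one unit of endorsement.
    row_sums = rating_matrix.sum(axis=1, keepdims=True)
    P = np.divide(rating_matrix, row_sums,
                  out=np.zeros_like(rating_matrix), where=row_sums > 0)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):  # power iteration
        r = (1 - damping) / n + damping * (r @ P)
    return r / r.sum()

ratings = np.array([[0, 3, 1],
                    [1, 0, 2],
                    [0, 4, 0]], dtype=float)
print(reputation(ratings).round(3))
```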

    Modeling the dynamics of large conditional heteroskedastic covariance matrices

    Many economic and financial time series exhibit time-varying volatility. GARCH models are tools for forecasting and analyzing the dynamics of this volatility. The co-movements in financial markets and financial assets around the globe have recently become a main area of interest of financial econometricians; hence, multivariate GARCH models have been introduced to capture these co-movements. A large variety of multivariate GARCH models exists, and each of these models has its advantages and limitations. An important goal in constructing multivariate GARCH models is to make them parsimonious without compromising their adequacy in real-world applications. Another is to ensure that the conditional covariance matrix is positive definite. Motivated by the idea that volatility in financial markets is driven by a few latent variables, a new parameterization in the multivariate context is proposed in this thesis. The factors in the proposed model are obtained through a recursive use of the singular value decomposition (SVD). This recursion enables us to sequentially extract the volatility clustering from the data set; accordingly, the model is called Sequential Volatility Extraction (SVX for short). Logarithmically transformed singular values and the components of their corresponding singular vectors are modeled using the ARMA approach; in terms of its basic idea and modeling approach, the model thus resembles a stochastic volatility model. Empirical analysis and comparison with existing multivariate GARCH models show that the proposed model is parsimonious, requiring fewer parameters to estimate than the two alternative models (DCC and GOGARCH), while the resulting covariance matrices are positive (semi-)definite. Hence, the model fulfills the basic requirements of a multivariate GARCH model. Based on these findings, it can be concluded that the SVX model can be applied to financial data of dimensions ranging from low to high.
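The first extraction step of such an SVD-based factor construction might look as follows (a single, non-recursive pass on synthetic data, so an illustrative simplification of the sequential procedure described above):

```python
# Rough sketch: take the SVD of a (time x assets) return matrix,
# log-transform the singular values (to be modeled with ARMA), and note
# that the factorization induces a positive semi-definite covariance.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.standard_normal((500, 5)) * np.array([1.0, 0.8, 0.5, 0.3, 0.2])

U, s, Vt = np.linalg.svd(returns, full_matrices=False)
log_singular_values = np.log(s)   # inputs for a univariate ARMA-type model
factors = U * s                   # latent volatility factors (columns)

# Reconstruction check: a model for s and Vt induces the covariance
# estimate Vt' diag(s^2 / T) Vt, which is positive semi-definite.
T = returns.shape[0]
cov_hat = Vt.T @ np.diag(s**2 / T) @ Vt
print(np.allclose(cov_hat, returns.T @ returns / T))  # True
```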

    The asymptotic behavior of the term structure of interest rates

    In this dissertation we investigate long-term interest rates, i.e. interest rates with maturity going to infinity, in the post-crisis interest rate market. Three different concepts of long-term interest rates are considered for this purpose: the long-term yield, the long-term simple rate, and the long-term swap rate. We analyze the properties as well as the interrelations of these long-term interest rates. In particular, we study the asymptotic behavior of the term structure of interest rates in some specific models. First, we compute the three long-term interest rates in the HJM framework with different stochastic drivers, namely Brownian motions, Lévy processes, and affine processes on the state space of positive semidefinite symmetric matrices. The HJM setting has the advantage that the entire yield curve can be modeled directly. Furthermore, by considering increasingly general classes of drivers, we are able to take into account the impact of different risk factors and their dependence structure on the long end of the yield curve. Finally, we study the long-term interest rates, and especially the long-term swap rate, in the Flesaker-Hughston model and the linear-rational methodology.
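For reference, the first two of these rates are commonly defined as limits of the corresponding finite-maturity rates, based on zero-coupon bond prices $P(t,T)$ (the thesis's exact conventions may differ in sign or compounding; the swap rate requires the full swap cash-flow structure and is omitted here):

```latex
\ell_t = \lim_{T \to \infty} -\frac{\log P(t,T)}{T - t}
\quad \text{(long-term yield)},
\qquad
L_t = \lim_{T \to \infty} \frac{1 - P(t,T)}{(T-t)\, P(t,T)}
\quad \text{(long-term simple rate)}.
```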

    Bayesian inference for infectious disease transmission models based on ordinary differential equations

    Predicting the epidemiological effects of new vaccination programmes through mathematical-statistical transmission modelling is of increasing importance for the German Standing Committee on Vaccination. Such models commonly capture large populations using a compartmental structure whose dynamics are governed by a system of ordinary differential equations (ODEs). Unfortunately, these ODE-based models are generally computationally expensive to solve, which poses a challenge for any statistical procedure inferring the corresponding model parameters from disease surveillance data. Thus, in practice, parameters are often fixed based on epidemiological knowledge, which ignores uncertainty. A Bayesian inference framework incorporating this prior knowledge promises to be a more suitable approach, allowing for additional parameter flexibility. This thesis is concerned with statistical methods for performing Bayesian inference for ODE-based models. A posterior approximation approach is presented, based on a Gaussian distribution around the posterior mode with covariance derived from the observed Fisher information. By employing a newly proposed method for adjusting the likelihood impact by means of a power posterior, the approximation procedure is able to account for the residual autocorrelation in the data given the model. As an alternative to this approximation approach, an adaptive Metropolis-Hastings algorithm is described, which is geared towards efficient posterior sampling in the case of a high-dimensional parameter space and considerable parameter collinearities. In order to identify relevant model components, Bayesian model selection criteria based on the marginal likelihood of the data are applied. The marginal likelihood of each considered model is estimated via a newly proposed approach that utilizes the posterior sample obtained from the preceding Metropolis-Hastings algorithm. Furthermore, the thesis contains an application of the presented methods, predicting the epidemiological effects of introducing rotavirus childhood vaccination in Germany. Again, an ODE-based compartmental model accounting for the most relevant transmission aspects of rotavirus is presented. After extending the model with vaccination mechanisms, it becomes possible to estimate the rotavirus vaccine effectiveness from routinely collected surveillance data. By employing the Bayesian framework, model predictions of the future epidemiological development under a high vaccination coverage rate incorporate uncertainty regarding both model structure and parameters. The forecast suggests that routine vaccination may cause a rotavirus incidence increase among older children and the elderly, but drastically reduces the disease burden among the target group of young children, even beyond the expected direct vaccination effect, by means of herd protection. Altogether, this thesis provides a statistical perspective on the modelling of routine vaccination effects in order to assist decision making under uncertainty. The presented methodology is easily applicable to other infectious diseases such as influenza.
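A bare-bones version of such an inference pipeline, an SIR model solved with scipy inside a random-walk Metropolis-Hastings loop, is sketched below; the prior, the Poisson observation model, and the data are illustrative stand-ins, far simpler than the adaptive sampler and the rotavirus model described above:

```python
# Sketch: Bayesian inference for an ODE-based transmission model via
# random-walk Metropolis-Hastings on the transmission rate beta.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import poisson, gamma

def sir(t, y, beta, gam):
    S, I, R = y
    return [-beta * S * I, beta * S * I - gam * I, gam * I]

def log_posterior(beta, cases, t_obs, gam=0.5):
    if beta <= 0:
        return -np.inf
    sol = solve_ivp(sir, (0, t_obs[-1]), [0.99, 0.01, 0.0],
                    t_eval=t_obs, args=(beta, gam))
    incidence = np.maximum(beta * sol.y[0] * sol.y[1] * 1e4, 1e-9)
    return poisson.logpmf(cases, incidence).sum() + gamma.logpdf(beta, a=2, scale=1)

rng = np.random.default_rng(1)
t_obs = np.arange(1, 30.0)
cases = rng.poisson(20, size=t_obs.size)      # stand-in surveillance data

beta, lp = 1.0, -np.inf
chain = []
for _ in range(2000):                         # random-walk MH
    prop = beta + 0.1 * rng.standard_normal()
    lp_prop = log_posterior(prop, cases, t_obs)
    if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
        beta, lp = prop, lp_prop
    chain.append(beta)
print("posterior mean beta ~", np.mean(chain[500:]))
```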

    Capturing and handling position errors in location-based authorization

    The ever-increasing technical capabilities of mobile devices now make it feasible to use them for mobile work or for controlling industrial manufacturing processes. For reasons of information security and operational safety, as well as to implement functional requirements, it is often necessary to restrict the availability of the corresponding access rights to users within authorized zones. For example, the reading of critical data can be restricted to individual offices, or the mobile control of machines to suitable locations within a factory building. This requires determining the user's position. In real-world deployments, however, position estimates can exhibit errors on the scale of the authorized zones themselves. To date, no solutions exist that take these errors into account in authorization decisions in order to minimize the damage resulting from incorrect decisions. Furthermore, there are currently no methods for analyzing the quality properties of such location constraints prior to deployment and for deciding whether a given positioning system is suitable given the magnitude of its position errors. This thesis therefore presents solutions for capturing and handling such position errors in the context of location-based authorization. First, an estimation method for position errors in pattern-based positioning systems is introduced, which derives a probability distribution for the user's location from the characteristics of the performed measurements. To efficiently determine from this the probability that the user is within an authorized zone, an algorithm is presented that, based on precomputations, achieves a substantial runtime improvement over direct computation. For the first time, a comprehensive comparison of existing location-based authorization strategies on the basis of decision theory is presented. With the risk-based authorization strategy, a new methodology is introduced that is optimal from a decision-theoretic point of view. Approaches are presented for extending classical access control models with location constraints that, during enforcement, take into account the possibility of position errors and the consequences of incorrect decisions. For the specification of authorized zones, property models are introduced which, in contrast to conventional polygons, model for every location the probability of observing a required property there. Methods are also presented for reducing the influence of measurement outliers on authorization decisions. Furthermore, analysis methods are introduced that allow a qualitative and quantitative assessment of the suitability of positioning systems for a given scenario. The quantitative assessment is based on the newly developed concept of authorization models, which specify for every location the probability of obtaining a position estimate there that leads to authorization. The qualitative assessment provides, for the first time, a binary criterion for making a concrete statement about the suitability of a positioning system for a given scenario. The applicability of this analysis method is illustrated in a case study, which demonstrates the necessity of such an analysis before location-based authorization is deployed. It is shown that, for typical positioning systems, the developed risk-based methods enable a substantial reduction of the damage caused by incorrect decisions, thereby improving the practicality of location-based authorization.
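The decision-theoretic core of the risk-based strategy can be stated in a few lines: grant access exactly when the expected cost of granting is below the expected cost of denying (the cost values and probabilities below are invented for illustration):

```python
# Sketch of a risk-based authorization decision: weigh the cost of a false
# grant by the probability that the user is outside the authorized zone,
# and the cost of a false deny by the probability of being inside.
def risk_based_decision(p_inside, cost_false_grant, cost_false_deny):
    """p_inside: probability mass of the position estimate inside the zone."""
    expected_cost_grant = (1 - p_inside) * cost_false_grant
    expected_cost_deny = p_inside * cost_false_deny
    return "grant" if expected_cost_grant < expected_cost_deny else "deny"

# With highly sensitive data, a 90% inside-probability can still be too low:
print(risk_based_decision(0.90, cost_false_grant=1000, cost_false_deny=10))   # deny
print(risk_based_decision(0.999, cost_false_grant=1000, cost_false_deny=10))  # grant
```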

    Exploiting prior knowledge and latent variable representations for the statistical modeling and probabilistic querying of large knowledge graphs

    Large knowledge graphs increasingly add great value to various applications that require machines to recognize and understand queries and their semantics, as in search or question answering systems. These applications include Google search, Bing search and IBM's Watson, but also smart mobile assistants such as Apple's Siri, Google Now or Microsoft's Cortana. Popular knowledge graphs like DBpedia, YAGO or Freebase store a broad range of facts about the world, to a large extent derived from Wikipedia, currently the biggest web encyclopedia. In addition to these freely accessible open knowledge graphs, commercial ones have also evolved, including the well-known Google Knowledge Graph and Microsoft's Satori. Since the incompleteness and veracity of knowledge graphs are known problems, the statistical modeling of knowledge graphs has gained increasing attention in recent years. Some of the leading approaches are based on latent variable models, which show both excellent predictive performance and scalability. Latent variable models learn embedding representations of domain entities and relations (representation learning). From these embeddings, priors for every possible fact in the knowledge graph are generated, which can be exploited for data cleansing and completion, or as prior knowledge to support triple extraction from unstructured textual data, as successfully demonstrated by Google's Knowledge Vault project. However, large knowledge graphs impose constraints on the complexity of the latent embeddings learned by these models. For graphs with millions of entities and thousands of relation types, latent variable models must rely on low-dimensional embeddings for entities and relation types to remain tractable. The work described in this thesis extends the application of latent variable models to large knowledge graphs in three important dimensions. First, it is shown how the integration of ontological constraints on the domain and range of relation types enables latent variable models to exploit latent embeddings of reduced complexity for modeling large knowledge graphs. The integration of this prior knowledge into the models leads to a substantial increase in both predictive performance and scalability, with improvements of up to 77% in link-prediction tasks. Since manually designed domain and range constraints can be absent or fuzzy, we also propose and study an alternative approach based on a local closed-world assumption, which derives domain and range constraints from observed data without the need for prior knowledge extracted from the curated schema of the knowledge graph. We show that such an approach leads to similarly significant improvements in modeling quality. Further, we demonstrate that these two types of domain and range constraints are of general value to latent variable models by integrating and evaluating them on the current state of the art of latent variable models, represented by RESCAL, Translational Embedding, and the neural network approach used by the recently proposed Google Knowledge Vault system. In the second part of the thesis, it is shown that the three aforementioned approaches all perform well but do not share many commonalities in the way they model knowledge graphs. These differences can be exploited in ensemble solutions that improve the predictive performance even further. The third part of the thesis concerns the efficient querying of statistically modeled knowledge graphs. This thesis interprets statistically modeled knowledge graphs as probabilistic databases, where the latent variable models define a probability distribution over triples. From this perspective, link prediction is equivalent to querying ground triples, which is a standard functionality of latent variable models. For more complex querying involving, e.g., joins and projections, the theory of probabilistic databases provides evaluation rules. In this thesis, it is shown how the intrinsic features of latent variable models can be combined with the theory of probabilistic databases to realize efficient probabilistic querying of the modeled graphs.
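A minimal sketch of type-constrained bilinear scoring in the RESCAL style, where candidates violating a relation's domain or range are never scored, may help fix ideas (entities, types, and the "trained" matrices are mocked with random values):

```python
# RESCAL-style bilinear scoring with a domain/range constraint: a triple
# (s, r, o) is scored as a_s^T R_r a_o, and only type-compatible candidates
# are scored at all, which is the mechanism that keeps embeddings small.
import numpy as np

rng = np.random.default_rng(0)
d, n_entities = 8, 5
A = rng.standard_normal((n_entities, d))   # entity embeddings (mock values)
R = rng.standard_normal((d, d))            # relation matrix, e.g. "bornIn"
entity_type = ["person", "person", "city", "city", "band"]
rel_domain, rel_range = "person", "city"   # ontological constraint

def score(s, o):
    """Bilinear score a_s^T R a_o for the triple (s, bornIn, o)."""
    return A[s] @ R @ A[o]

subject = 0
assert entity_type[subject] == rel_domain      # subject must respect the domain
candidates = [o for o in range(n_entities) if entity_type[o] == rel_range]
ranked = sorted(candidates, key=lambda o: -score(subject, o))
print(ranked)  # only the two city entities are ever scored
```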

    Complex queries and complex data

    With the widespread availability of wearable computers, equipped with sensors such as GPS or cameras, and with the ubiquitous presence of micro-blogging platforms, social media sites and digital marketplaces, data can be collected and shared on a massive scale. A necessary building block for taking advantage of this vast amount of information is efficient and effective similarity search algorithms that are able to find objects in a database which are similar to a query object. Due to the general applicability of similarity search over different data types and applications, the formalization of this concept and the development of strategies for evaluating similarity queries have evolved into an important field of research in the database community, the spatio-temporal database community, and others, such as information retrieval and computer vision. This thesis concentrates on a special instance of similarity queries, namely k-nearest neighbor (kNN) queries and their close relative, reverse k-nearest neighbor (RkNN) queries. As a first contribution, we provide an in-depth analysis of the RkNN join. While the problem of reverse nearest neighbor queries has received a vast amount of research interest, the problem of performing such queries in bulk has not seen an in-depth analysis so far. We first formalize the RkNN join, identifying its monochromatic and bichromatic versions and their self-join variants. After pinpointing the monochromatic RkNN join as an important and interesting instance, we develop solutions for this class, including a self-pruning and a mutual pruning algorithm. We then evaluate these algorithms extensively on a variety of synthetic and real datasets. From this starting point of similarity queries on certain data, we shift our focus to uncertain data, addressing nearest neighbor queries in uncertain spatio-temporal databases. Starting from the traditional definition of nearest neighbor queries and a data model for uncertain spatio-temporal data, we develop efficient query mechanisms that consider temporal dependencies during query evaluation. We define intuitive query semantics, aiming not only at returning the objects closest to the query but also their probability of being a nearest neighbor. After theoretically evaluating these query predicates, we develop efficient querying algorithms for them. Given the findings of this research on nearest neighbor queries, we extend the results to reverse nearest neighbor queries. Finally, we address the problem of querying large datasets containing set-based objects, namely image databases, where images are represented by (multi-)sets of vectors and additional metadata describing the position of features in the image. We aim at reducing the number of kNN queries performed during query processing and evaluate a modified pipeline that optimizes the query accuracy for a small number of kNN queries. Additionally, as feature representations in object recognition are moving more and more from the real-valued domain to the binary domain, we evaluate efficient indexing techniques for binary feature vectors.
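For reference, the semantics of the monochromatic RkNN self-join can be fixed with a naive O(n^2) implementation; the pruning algorithms developed in the thesis avoid exactly this full distance table:

```python
# Naive monochromatic RkNN self-join: for every point q, report the points
# that have q among their k nearest neighbors. Fixes the semantics only;
# no pruning is performed.
import numpy as np

def rknn_join(points, k=2):
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-matches
    knn = np.argsort(d, axis=1)[:, :k]     # k nearest neighbors of each point
    result = {q: [] for q in range(n)}
    for p in range(n):
        for q in knn[p]:
            result[q].append(p)            # p is a reverse kNN of q
    return result

pts = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]], dtype=float)
print(rknn_join(pts, k=1))
# Point 4 has point 2 as its nearest neighbor, so it appears in 2's result
# list while its own list stays empty.
```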

    Scaling limits of random trees and graphs

    In this thesis, we establish the scaling limits of several models of random trees and graphs, enlarging and completing the now long list of random structures that admit David Aldous' continuum random tree (CRT) as their scaling limit. Our results answer important open questions, in particular the conjecture by Aldous concerning the scaling limit of random unlabelled unrooted trees. We also show that random graphs from subcritical graph classes admit the CRT as their scaling limit, proving (in a strong form) a conjecture by Marc Noy and Michael Drmota concerning the limit of the diameter of these graphs. Furthermore, we provide a new proof of the results by Bénédicte Haas and Grégory Miermont on the scaling limits of random Pólya trees, extending their result to random Pólya trees with arbitrary vertex-degree restrictions.
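The prototypical form of such a result, stated here for a random tree $T_n$ with $n$ vertices and graph distance $d_{T_n}$, is convergence after rescaling by $\sqrt{n}$ (the constant $c$ depends on the model; this is the standard formulation, not a quotation from the thesis):

```latex
\left( T_n, \; \frac{c}{\sqrt{n}} \, d_{T_n} \right)
\;\xrightarrow[n \to \infty]{d}\;
(\mathcal{T}_{\mathrm{e}}, d_{\mathrm{e}})
\quad \text{in the Gromov-Hausdorff sense},
```

where $(\mathcal{T}_{\mathrm{e}}, d_{\mathrm{e}})$ denotes the CRT coded by a Brownian excursion.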

    Spectral and dynamical properties of certain quantum Hamiltonians in dimension two

    After 2004, when it became possible for the first time to isolate graphene flakes, interest in the quantum mechanics of planar systems intensified significantly. In graphene, which is a single layer of carbon atoms arranged in a regular hexagonal structure, the generator of the dynamics near the band edge is the massless Dirac operator in dimension two. We investigate the spectrum of the two-dimensional massless Dirac operator H_D coupled to an external electromagnetic field. More precisely, our focus lies on the characterisation of the spectrum σ(H_D) for field configurations that are generated by unbounded electric and magnetic potentials. We observe that the existence of gaps in σ(H_D) depends on the ratio V^2/B at infinity, i.e. the ratio of the electric potential V and the magnetic field B. In particular, a sharp bound on V^2/B is given below which σ(H_D) is purely discrete. Further, we show that if the ratio V^2/B is unbounded at infinity, H_D has no spectral gaps for a large class of fields B and potentials V. The latter statement leads to examples of two-dimensional massless Dirac operators with dense pure point spectrum. We extend the ideas developed for H_D to the classical Pauli (and the magnetic Schrödinger) operator in dimension two. It turns out that such non-relativistic operators with a strong repulsive potential also admit criteria for spectral gaps in terms of B and V. Similarly to the case of the Dirac operator, we show that those gaps do not occur in general if |V| dominates B at infinity. It should be mentioned that this leads to a complete characterisation of the spectrum of certain Pauli (and Schrödinger) operators with very elementary, rotationally symmetric field configurations. Considering for the Dirac operator H_D the regime of a growing ratio V^2/B, a transition from pure point to continuous spectrum occurs, a phenomenon that is particularly interesting from the dynamical point of view. Therefore, in the second part of the thesis we address the question under which spectral conditions ballistic wave packet spreading in two-dimensional Dirac systems is possible. To be more explicit, we study the following problem: do statements on the spectral type of H_D already suffice to decide whether the time mean of the expectation value $$\frac{1}{T} \int_0^T \langle \psi(t), |x|^2\, \psi(t) \rangle \, \mathrm{d}t$$ behaves like T^2? Here ψ(t) denotes the time evolution of a state ψ under the corresponding Dirac operator. We can answer this question affirmatively, at least for certain electromagnetic fields with symmetry.
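For orientation, the operator under study has the following standard form (Pauli matrices $\sigma_1, \sigma_2$, magnetic vector potential $A$ with field $B$, electric potential $V$; conventions may differ slightly from the thesis):

```latex
H_D = \sigma \cdot (-i\nabla - A) + V
    = \sigma_1\,(-i\partial_1 - A_1) + \sigma_2\,(-i\partial_2 - A_2) + V,
\qquad
B = \partial_1 A_2 - \partial_2 A_1 .
```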

    Spectral and eigenfunction correlations of finite-volume Schrödinger operators

    The goal of this thesis is a mathematical understanding of a phenomenon called Anderson's orthogonality in the physics literature. Given two non-interacting fermionic systems which differ by an exterior short-range scattering potential, the scalar product of the corresponding ground states is predicted to decay algebraically in the thermodynamic limit. This decay is referred to as Anderson's orthogonality catastrophe in the physics literature; it goes back to P. W. Anderson [Phys. Rev. Lett. 18:1049--1051] and is used to explain anomalies in the X-ray absorption spectrum of metals. We call this scalar product $S_N^L$, where $N$ refers to the particle number and $L$ to the diameter of the considered fermionic system. This decay was proven in the works [Commun. Math. Phys. 329:979--998] and [arXiv:1407.2512] for rather general pairs of Schrödinger operators in arbitrary dimension $d\in\mathbb{N}$, i.e. $|S_N^L|^2\le L^{-\gamma}$ in the thermodynamic limit $N/L^d\to \rho>0$ approaching a positive particle density. In the general case, the largest decay exponent found is given by $\gamma=\frac{1}{\pi^2} \left\| \arcsin|T/2| \right\|_{\mathrm{HS}}^2$, where $T$ refers to the scattering T-matrix. In this thesis, we prove such upper bounds in more general situations than considered in both [Commun. Math. Phys. 329:979--998] and [arXiv:1407.2512]. Furthermore, we provide the first rigorous proof of the exact asymptotics Anderson predicted. We prove that, in the $3$-dimensional toy model of a Dirac-$\delta$ perturbation, the exact decay exponent is given by $\zeta:= \delta^2/\pi^2$. Here, $\delta$ refers to the s-wave scattering phase shift. In particular, this result shows that the previously found decay exponent $\gamma$ does not provide the correct asymptotics of $S_N^L$ in general. Since the decay exponent is expressed in terms of scattering theory, these bounds depend on the existence of absolutely continuous spectrum of the underlying Schrödinger operators. We are able to deduce a different behavior in the contrary situation of Anderson localization: we prove the non-vanishing of the expectation value of the non-interacting many-body scalar product in the thermodynamic limit. Apart from the behavior of the scalar product of the non-interacting ground states, we also study the asymptotics of the difference of the ground-state energies. We show that this difference converges in the thermodynamic limit to the integral of the spectral-shift function up to the Fermi energy. Furthermore, we quantify the error for models on the half-axis and show that higher-order error terms depend on the particular thermodynamic limit chosen.
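For non-interacting fermions the ground states are Slater determinants, so the overlap reduces to a determinant of single-particle overlaps; this standard identity (notation assumed here, not quoted from the thesis) is what makes $S_N^L$ amenable to scattering-theoretic analysis:

```latex
% phi_j^L, psi_k^L: the lowest N eigenfunctions of the unperturbed and the
% perturbed one-particle operator on the box of diameter L.
S_N^L = \det \big( \langle \varphi_j^L, \psi_k^L \rangle \big)_{j,k=1}^{N},
\qquad
|S_N^L|^2 \le L^{-\gamma} \quad \text{as } N/L^d \to \rho > 0 .
```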

    Lorentz invariant quantum dynamics in the multi-time formalism

    The present work deals with the idea of a multi-time wave function, i.e. a wave function with N space-time arguments for N particles. Firstly, a careful derivation of the necessity of multi-time wave functions in relativistic quantum mechanics is given and a general formalism is developed. Secondly, the physical meaning of multi-time wave functions is discussed in connection with realistic relativistic quantum theories, in particular the "Hypersurface Bohm-Dirac" model. Thirdly, a first interacting model for multi-time wave functions of two Dirac particles in 1+1 space-time dimensions is constructed. Interaction is achieved by means of boundary conditions on configuration space-time, a mechanism closely related to zero-range physics. This is remarkable, as a restrictive consistency condition rules out various types of interaction and consequently no rigorous interacting model was known before. Fourthly, the model is extended to more general types of interaction and to the N-particle case. Higher dimensions are also discussed. Finally, the "Two-Body Dirac equations" of constraint theory are placed within the context of the multi-time formalism. In particular, the question of probability conservation is critically discussed, leading to further implications both for fundamental and applied questions.
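The consistency condition alluded to above takes, in its standard form, one Schrödinger-type equation per time coordinate, together with a commutator condition that is necessary for the system to admit common solutions (notation assumed, not quoted from the thesis):

```latex
% Multi-time wave function psi(x_1, t_1, ..., x_N, t_N):
i \frac{\partial \psi}{\partial t_k} = H_k \, \psi \quad (k = 1, \dots, N),
\qquad
\Big[\, i\partial_{t_j} - H_j \,,\; i\partial_{t_k} - H_k \,\Big] = 0 \quad (j \neq k).
```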