Information Theory
See recent articles
- [1] arXiv:2409.12308 [pdf, html, other]
-
Title: Robust DOA Estimation Based on Dual Lawson Norm for RIS-Aided Wireless Communication SystemsComments: 10 figures, 28 pagesSubjects: Information Theory (cs.IT)
Reconfigurable intelligent surfaces (RIS) can actively perform beamforming and have become a crucial enabler for wireless systems in the future. The direction-of-arrival (DOA) estimates of RIS received signals can help design the reflection control matrix and improve communication quality. In this paper, we design a RIS-assisted system and propose a robust Lawson norm-based multiple-signal-classification (LN-MUSIC) DOA estimation algorithm for impulsive noise, which is divided into two parts. The first one, the non-convex Lawson norm is used as the error criterion along with a regularization constraint to formulate the optimization problem. Then, a Bregman distance based alternating direction method of multipliers is used to solve the problem and recover the desired signal. The second part is to use the multiple signal classification (MUSIC) to find out the DOAs of targets based on their sparsity in the spatial domain. In addition, we also propose a RIS control matrix optimization strategy that requires no channel state information, which effectively enhances the desired signals and improves the performance of the LN-MUSIC algorithm. A Cramer-Rao-lower-bound (CRLB) of the proposed DOA estimation algorithm is presented and verifies its feasibility. Simulated results show that the proposed robust DOA estimation algorithm based on the Lawson norm can effectively suppress the impact of large outliers caused by impulsive noise on the estimation results, outperforming existing methods.
- [2] arXiv:2409.12387 [pdf, html, other]
-
Title: On the Regret of Coded Caching with Adversarial RequestsSubjects: Information Theory (cs.IT); Machine Learning (cs.LG)
We study the well-known coded caching problem in an online learning framework, wherein requests arrive sequentially, and an online policy can update the cache contents based on the history of requests seen thus far. We introduce a caching policy based on the Follow-The-Perturbed-Leader principle and show that for any time horizon T and any request sequence, it achieves a sub-linear regret of \mathcal{O}(\sqrt(T) ) with respect to an oracle that knows the request sequence beforehand. Our study marks the first examination of adversarial regret in the coded caching setup. Furthermore, we also address the issue of switching cost by establishing an upper bound on the expected number of cache updates made by our algorithm under unrestricted switching and also provide an upper bound on the regret under restricted switching when cache updates can only happen in a pre-specified subset of timeslots. Finally, we validate our theoretical insights with numerical results using a real-world dataset
- [3] arXiv:2409.12710 [pdf, html, other]
-
Title: Age of gossip from connective properties via first passage percolationComments: 12 pagesSubjects: Information Theory (cs.IT); Probability (math.PR)
In gossip networks, a source node forwards time-stamped updates to a network of observers according to a Poisson process. The observers then update each other on this information according to Poisson processes as well. The Age of Information (AoI) of a given node is the difference between the current time and the most recent time-stamp of source information that the node has received.
We provide a method for evaluating the AoI of a node in terms of first passage percolation. We then use this distributional identity to prove matching upper and lower bounds on the AoI in terms of connectivity properties of the underlying network. In particular, if one sets $X_v$ to be the AoI of node $v$ on a finite graph $G$ with $n$ nodes, then we define $m_\ast = \min\{m : m \cdot |B_m(v)| \geq n\}$ where $B_m(v)$ is the ball of radius $m$ in $G$. In the case when the maximum degree of $G$ is bounded by $\Delta$ we prove $\mathbb{E} X_v = \Theta_\Delta(m_\ast)$.
As corollaries, we solve multiple open problems in the literature such as showing the age of information on a subset of $\mathbb{Z}^d$ is $\Theta(n^{1/(d+1)})$.
We also demonstrate examples of graphs with AoI scaling like $n^{\alpha}$ for each $\alpha \in (0,1/2)$. These graphs are not vertex-transitive and in fact we show that if one considers the AoI on a graph coming from a vertex-transitive infinite graph then either $\mathbb{E} X_v = \Theta(n^{1/k})$ for some integer $k \geq 2$ or $\mathbb{E} X_v = n^{o(1)}$. - [4] arXiv:2409.12851 [pdf, html, other]
-
Title: Harnessing Stacked Intelligent Metasurface for Enhanced Cell-Free Massive MIMO Systems: A Low-Power and Cost ApproachSubjects: Information Theory (cs.IT); Signal Processing (eess.SP)
In this paper, we explore the integration of low-power, low-cost stacked intelligent metasurfaces (SIM) into cell-free (CF) massive multiple-input multiple-output (mMIMO) systems to enhance access point (AP) capabilities and address high power consumption and cost challenges. Specifically, we investigate the uplink performance of a SIM-enhanced CF mMIMO system and propose a novel system framework. First, the closed-form expressions of the spectral efficiency (SE) are obtained using the unique two-layer signal processing framework of CF mMIMO systems. Second, to mitigate inter-user interference, an interference-based greedy algorithm for pilot allocation is introduced. Third, a wave-based beamforming algorithm for SIM is proposed, based only on statistical channel state information, which effectively reduces the fronthaul costs. Finally, a max-min SE power control algorithm is proposed to improve the performance of UE with inferior channel conditions. The results indicate that increasing the number of SIM layers and meta-atoms leads to significant performance improvements and allows for a reduction in the number of APs and AP antennas, thus lowering the costs. In particular, the best SE performance is achieved with the deployment of 20 APs plus 1200 SIM meta-atoms. Finally, the proposed wave-based beamforming algorithm can enhance the SE performance of SIM-enhanced CF-mMIMO systems by 57\%, significantly outperforming traditional CF mMIMO systems.
- [5] arXiv:2409.12870 [pdf, html, other]
-
Title: Joint AP-UE Association and Precoding for SIM-Aided Cell-Free Massive MIMO SystemsSubjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Cell-free (CF) massive multiple-input multiple-output (mMIMO) systems are emerging as promising alternatives to cellular networks, especially in ultra-dense environments. However, further capacity enhancement requires the deployment of more access points (APs), which will lead to high costs and high energy consumption. To address this issue, in this paper, we explore the integration of low-power, low-cost stacked intelligent metasurfaces (SIM) into CF mMIMO systems to enhance AP capabilities. The key point is that SIM performs precoding-related matrix operations in the wave domain. As a consequence, each AP antenna only needs to transmit data streams for a single user equipment (UE), eliminating the need for complex baseband digital precoding. Then, we formulate the problem of joint AP-UE association and precoding at APs and SIMs to maximize the system sum rate. Due to the non-convexity and high complexity of the formulated problem, we propose a two-stage signal processing framework to solve it. In particular, in the first stage, we propose an AP antenna greedy association (AGA) algorithm to minimize UE interference. In the second stage, we introduce an alternating optimization (AO)-based algorithm that separates the joint power and wave-based precoding optimization problem into two distinct sub-problems: the complex quadratic transform method is used for AP antenna power control, and the projection gradient ascent (PGA) algorithm is employed to find suboptimal solutions for the SIM wave-based precoding. Finally, the numerical results validate the effectiveness of the proposed framework and assess the performance enhancement achieved by the algorithm in comparison to various benchmark schemes. The results show that, with the same number of SIM meta-atoms, the proposed algorithm improves the sum rate by approximately 275% compared to the benchmark scheme.
New submissions for Friday, 20 September 2024 (showing 5 of 5 entries )
- [6] arXiv:2409.12343 (cross-list from math.OC) [pdf, html, other]
-
Title: Seperable Bregman Framework for Sparsity Constrained Nonlinear OptimizationSubjects: Optimization and Control (math.OC); Information Theory (cs.IT)
This paper considers the minimization of a continuously differentiable function over a cardinality constraint. We focus on smooth and relatively smooth functions. These smoothness criteria result in new descent lemmas. Based on the new descent lemmas, novel optimality conditions and algorithms are developed, which extend the previously proposed hard-thresholding algorithms. We give a theoretical analysis of these algorithms and extend previous results on properties of iterative hard thresholding-like algorithms. In particular, we focus on the weighted $\ell_2$ norm, which requires efficient solution of convex subproblems. We apply our algorithms to compressed sensing problems to demonstrate the theoretical findings and the enhancements achieved through the proposed framework.
- [7] arXiv:2409.12379 (cross-list from cs.CV) [pdf, html, other]
-
Title: Enhancing 3D Robotic Vision Robustness by Minimizing Adversarial Mutual Information through a Curriculum Training ApproachSubjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Robotics (cs.RO)
Adversarial attacks exploit vulnerabilities in a model's decision boundaries through small, carefully crafted perturbations that lead to significant mispredictions. In 3D vision, the high dimensionality and sparsity of data greatly expand the attack surface, making 3D vision particularly vulnerable for safety-critical robotics. To enhance 3D vision's adversarial robustness, we propose a training objective that simultaneously minimizes prediction loss and mutual information (MI) under adversarial perturbations to contain the upper bound of misprediction errors. This approach simplifies handling adversarial examples compared to conventional methods, which require explicit searching and training on adversarial samples. However, minimizing prediction loss conflicts with minimizing MI, leading to reduced robustness and catastrophic forgetting. To address this, we integrate curriculum advisors in the training setup that gradually introduce adversarial objectives to balance training and prevent models from being overwhelmed by difficult cases early in the process. The advisors also enhance robustness by encouraging training on diverse MI examples through entropy regularizers. We evaluated our method on ModelNet40 and KITTI using PointNet, DGCNN, SECOND, and PointTransformers, achieving 2-5% accuracy gains on ModelNet40 and a 5-10% mAP improvement in object detection. Our code is publicly available at this https URL.
- [8] arXiv:2409.12491 (cross-list from math.ST) [pdf, html, other]
-
Title: Two New Families of Local Asymptotically Minimax Lower Bounds in Parameter EstimationComments: 26 pages, submitted for publicationSubjects: Statistics Theory (math.ST); Information Theory (cs.IT)
We propose two families of asymptotically local minimax lower bounds on parameter estimation performance. The first family of bounds applies to any convex, symmetric loss function that depends solely on the difference between the estimate and the true underlying parameter value (i.e., the estimation error), whereas the second is more specifically oriented to the moments of the estimation error. The proposed bounds are relatively easy to calculate numerically (in the sense that their optimization is over relatively few auxiliary parameters), yet they turn out to be tighter (sometimes significantly so) than previously reported bounds that are associated with similar calculation efforts, across a variety of application examples. In addition to their relative simplicity, they also have the following advantages: (i) Essentially no regularity conditions are required regarding the parametric family of distributions; (ii) The bounds are local (in a sense to be specified); (iii) The bounds provide the correct order of decay as functions of the number of observations, at least in all examples examined; (iv) At least the first family of bounds extends straightforwardly to vector parameters.
- [9] arXiv:2409.12579 (cross-list from math.CO) [pdf, html, other]
-
Title: Sharp estimates for Gowers norms on discrete cubesComments: 22 pages, Mathematica notebook attachedSubjects: Combinatorics (math.CO); Information Theory (cs.IT); Classical Analysis and ODEs (math.CA)
We study optimal dimensionless inequalities $$ \|f\|_{U^k} \leq \|f\|_{\ell^{p_{k,n}}} $$ that hold for all functions $f\colon\mathbb{Z}^d\to\mathbb{C}$ supported in $\{0,1,\ldots,n-1\}^d$ and estimates $$ \|1_A\|_{U^k}^{2^k}\leq |A|^{t_{k,n}} $$ that hold for all subsets $A$ of the same discrete cubes. A general theory, analogous to the work of de Dios Pont, Greenfeld, Ivanisvili, and Madrid, is developed to show that the critical exponents are related by $p_{k,n} t_{k,n} = 2^k$. This is used to prove the three main results of the paper: an explicit formula for $t_{k,2}$, which generalizes a theorem by Kane and Tao, two-sided asymptotic estimates for $t_{k,n}$ as $n\to\infty$ for a fixed $k\geq2$, which generalize a theorem by Shao, and a precise asymptotic formula for $t_{k,n}$ as $k\to\infty$ for a fixed $n\geq2$.
Cross submissions for Friday, 20 September 2024 (showing 4 of 4 entries )
- [10] arXiv:2402.09117 (replaced) [pdf, html, other]
-
Title: Deterministic identification over channels with finite output: a dimensional perspective on superlinear ratesComments: 39 pages, 5 figuresSubjects: Information Theory (cs.IT); Quantum Physics (quant-ph)
Following initial work by JaJa, Ahlswede and Cai, and inspired by a recent renewed surge in interest in deterministic identification (DI) via noisy channels, we consider the problem in its generality for memoryless channels with finite output, but arbitrary input alphabets. Such a channel is essentially given by its output distributions as a subset in the probability simplex. Our main findings are that the maximum length of messages thus identifiable scales superlinearly as $R\,n\log n$ with the block length $n$, and that the optimal rate $R$ is bounded in terms of the covering (aka Minkowski, or Kolmogorov, or entropy) dimension $d$ of a certain algebraic transformation of the output set: $\frac14 d \leq R \leq \frac12 d$. Remarkably, both the lower and upper Minkowski dimensions play a role in this result. Along the way, we present a "Hypothesis Testing Lemma" showing that it is sufficient to ensure pairwise reliable distinguishability of the output distributions to construct a DI code. Although we do not know the exact capacity formula, we can conclude that the DI capacity exhibits superactivation: there exist channels whose capacities individually are zero, but whose product has positive capacity. We also generalise these results to classical-quantum channels with finite-dimensional output quantum system, in particular to quantum channels on finite-dimensional quantum systems under the constraint that the identification code can only use tensor product inputs.
- [11] arXiv:2304.12751 (replaced) [pdf, other]
-
Title: Centrality-Based Node Feature Augmentation for Robust Network AlignmentComments: 19 pages, 12 figures, 5 tables; its conference version was presented at the ACM International Conference on Information and Knowledge Management (CIKM 2022)Subjects: Social and Information Networks (cs.SI); Information Theory (cs.IT); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Networking and Internet Architecture (cs.NI)
Network alignment (NA) is the task of discovering node correspondences across multiple networks. Although NA methods have achieved remarkable success in a myriad of scenarios, their effectiveness is not without additional information such as prior anchor links and/or node features, which may not always be available due to privacy concerns or access restrictions. To tackle this challenge, we propose Grad-Align+, a novel NA method built upon a recent state-of-the-art NA method, the so-called Grad-Align, that gradually discovers a part of node pairs until all node pairs are found. In designing Grad-Align+, we account for how to augment node features in the sense of performing the NA task and how to design our NA method by maximally exploiting the augmented node features. To achieve this goal, Grad-Align+ consists of three key components: 1) centrality-based node feature augmentation (CNFA), 2) graph neural network (GNN)-aided embedding similarity calculation alongside the augmented node features, and 3) gradual NA with similarity calculation using aligned cross-network neighbor-pairs (ACNs). Through comprehensive experiments, we demonstrate that Grad-Align+ exhibits (a) the superiority over benchmark NA methods, (b) empirical validations as well as our theoretical findings to see the effectiveness of CNFA, (c) the influence of each component, (d) the robustness to network noises, and (e) the computational efficiency.
- [12] arXiv:2402.14053 (replaced) [pdf, html, other]
-
Title: Self-adhesivity in lattices of abstract conditional independence modelsComments: 39 pages, 4 figures; v2: major revisionSubjects: Combinatorics (math.CO); Information Theory (cs.IT)
We introduce an algebraic concept of the frame for abstract conditional independence (CI) models, together with basic operations with respect to which such a frame should be closed: copying and marginalization. Three standard examples of such frames are (discrete) probabilistic CI structures, semi-graphoids and structural semi-graphoids. We concentrate on those frames which are closed under the operation of set-theoretical intersection because, for these, the respective families of CI models are lattices. This allows one to apply the results from lattice theory and formal concept analysis to describe such families in terms of implications among CI statements.
The central concept of this paper is that of self-adhesivity defined in algebraic terms, which is a combinatorial reflection of the self-adhesivity concept studied earlier in context of polymatroids and information theory. The generalization also leads to a self-adhesivity operator defined on the hyper-level of CI frames. We answer some of the questions related to this approach and raise other open questions.
The core of the paper is in computations. The combinatorial approach to computation might overcome some memory and space limitation of software packages based on polyhedral geometry, in particular, if SAT solvers are utilized. We characterize some basic CI families over 4 variables in terms of canonical implications among CI statements. We apply our method in information-theoretical context to the task of entropic region demarcation over 5 variables. - [13] arXiv:2408.07067 (replaced) [pdf, html, other]
-
Title: Asymptotic quantification of entanglement with a single copyComments: 18+18 pagesSubjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Mathematical Physics (math-ph)
Despite the central importance of quantum entanglement in fueling many quantum technologies, the understanding of the optimal ways to exploit it is still beyond our reach, and even measuring entanglement in an operationally meaningful way is prohibitively difficult. This is due to the need to precisely characterise many-copy, asymptotic protocols for entanglement processing. Here we overcome these issues by introducing a new way of benchmarking the fundamental protocol of entanglement distillation (purification), where instead of measuring its asymptotic yield, we focus on the best achievable error. We connect this formulation of the task with an information-theoretic problem in composite quantum hypothesis testing known as generalised Sanov's theorem. By solving the latter problem -- which had no previously known solution even in classical information theory -- we thus compute the optimal asymptotic error exponent of entanglement distillation. We show this asymptotic solution to be given by the reverse relative entropy of entanglement, a single-letter quantity that can be evaluated using only a single copy of a quantum state, which is a unique feature among operational measures of entanglement. Altogether, we thus demonstrate a measure of entanglement that admits a direct operational interpretation as the optimal asymptotic rate of an important entanglement manipulation protocol while enjoying an exact, single-letter formula.
- [14] arXiv:2409.11693 (replaced) [pdf, html, other]
-
Title: On the second-order zero differential properties of several classes of power functions over finite fieldsSubjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
Feistel Boomerang Connectivity Table (FBCT) is an important cryptanalytic technique on analysing the resistance of the Feistel network-based ciphers to power attacks such as differential and boomerang attacks. Moreover, the coefficients of FBCT are closely related to the second-order zero differential spectra of the function $F(x)$ over the finite fields with even characteristic and the Feistel boomerang uniformity is the second-order zero differential uniformity of $F(x)$. In this paper, by computing the number of solutions of specific equations over finite fields, we determine explicitly the second-order zero differential spectra of power functions $x^{2^m+3}$ and $x^{2^m+5}$ with $m>2$ being a positive integer over finite field with even characteristic, and $x^{p^k+1}$ with integer $k\geq1$ over finite field with odd characteristic $p$. It is worth noting that $x^{2^m+3}$ is a permutation over $\mathbb{F}_{2^n}$ and only when $m$ is odd, $x^{2^m+5}$ is a permutation over $\mathbb{F}_{2^n}$, where integer $n=2m$. As a byproduct, we find $F(x)=x^4$ is a PN and second-order zero differentially $0$-uniform function over $\mathbb{F}_{3^n}$ with odd $n$. The computation of these entries and the cardinalities in each table aimed to facilitate the analysis of differential and boomerang cryptanalysis of S-boxes when studying distinguishers and trails.