Peiwei's Blog

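As I started writing, I realized that this post comes exactly one month after "Why One Should Not Eat Until Full". Over this month I have done my best to practice the "theory" I rambled about: eating less and dieting. My three daily meals are roughly as follows. Breakfast: soy milk and a tea egg, or milk with two slices of toast. Lunch: a sandwich from a convenience store. Dinner: home-cooked food (never takeout), without overeating; regulars include curry chicken rice, steamed seafood, beef, and dumplings. In addition, I swim three times a week on average and do one or two sessions of machine-based strength training. Stepping on the scale again today, I am 10 jin (5 kg) lighter than a month ago, which is truly gratifying! I will keep up this lifestyle and bring my weight down to 160 jin (80 kg) as soon as I can. I increasingly feel that the body's influence on the mind may be greater than the mind's influence on the body.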

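My weight keeps climbing, and my health checkup indicators are warning me that I need to diet. Dieting is something I have been learning recently as a way to strengthen my self-control; these are my thoughts on why one should not eat until full.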

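Human history is like a mighty river. Above its ever-calm surface, great thinkers stir up waves that still move the hearts of their followers hundreds or thousands of years later. Some of these waves matched the needs of their time, became a common consensus, and swelled into a movement of thought; others carried views that were ahead of their time or out of season and were shelved, leaving those who glean from the sea of history to lament their ill timing.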

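While chatting with a classmate this afternoon, I was struck by a fresh interpretation of Laozi's "道可道，非常道" ("The Dao that can be told is not the eternal Dao") from the perspective of information content.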

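Artificial intelligence is advancing by leaps and bounds, riding the tailwind of growing compute. From machine learning to deep learning based on neural networks, through the introduction of attention mechanisms and the development of pre-trained models, to today's general-purpose large language models, I feel that we are very close to the future.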

Publication: ISSTA 2021. Abstract: Software in the wild is usually released as stripped binaries that contain no debug information (e.g., function names). This paper studies the issue of reassigning descriptive names for functions to help facilitate reverse engineering. Since the essence of this issue is a data-driven prediction task, persuasive research should be based on sufficiently large-scale and diverse data. However, prior studies can only be based on small-scale datasets because their techniques suffer from heavyweight binary analysis, making them powerless in the face of big-size and large-scale binaries.

Publication: ICPC 2018. Abstract: When code is compiled, information is lost, including some of the structure of the original source code as well as local identifier names. Existing decompilers can reconstruct much of the original source code, but typically use meaningless placeholder variables for identifier names. Using variable names which are more natural in the given context can make the code much easier to interpret, despite the fact that variable names have no effect on the execution of the program.

Publication: CCS 2022. Abstract: Predicting function names in stripped binaries is an extremely useful but challenging task, as it requires summarizing the execution behavior and semantics of the function in human languages. Recently, there has been significant progress in this direction with machine learning. However, existing approaches fail to model the exhaustive function behavior and thus suffer from the poor generalizability to unseen binaries. To advance the state of the art, we present a function Symbol name prediction and binary Language Modeling (SymLM) framework, with a novel neural architecture that learns the comprehensive function semantics by jointly modeling the execution behavior of the calling context and instructions via a novel fusing encoder.

Publication: USENIX Security 22. Abstract: A common tool used by security professionals for reverse engineering binaries found in the wild is the decompiler. A decompiler attempts to reverse compilation, transforming a binary to a higher-level language such as C. High-level languages ease reasoning about programs by providing useful abstractions such as loops, typed variables, and comments, but these abstractions are lost during compilation. Decompilers are able to deterministically reconstruct structural properties of code, but comments, variable names, and custom variable types are technically impossible to recover.

Installing one_gadget requires Ruby >= 2.4, but the default ruby package on Ubuntu 16.04 does not meet this requirement, as shown below. If you want to study heap-related topics on Ubuntu 16.04 to avoid additional mechanisms such as tcache, this post shows how to update Ruby on Ubuntu 16.04 and install one_gadget. ERROR: Error installing one_gadget: bindata requires Ruby version >= 2.4.0. In fact, the author of one_gadget provided suggestions, see https://github.

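This is a challenge I assigned after the heap overflow unit when I was a teaching assistant for a software security course; the binary is available here. The challenge tests the safe unlinking problem in heap overflow exploitation.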

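While writing unit tests for a project, I needed some function definitions as test cases, so I asked ChatGPT to give me a weird C function definition and got the following: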

Why One Should Not Eat Until Full (Follow-up)

Why One Should Not Eat Until Full

Let Ideas Be Tools, Not Shackles

The Dao That Can Be Told Is Not the Eternal Dao

Large Models Will Change How Software Is Developed

Paper Note | A Lightweight Framework for Function Name Reassignment Based on Large-Scale Stripped Binaries

Paper Note | Meaningful Variable Names for Decompiled Code: A Machine Translation Approach

Paper Note | SymLM: Predicting Function Names in Stripped Binaries via Context-Sensitive Execution-Aware Code Embeddings

Paper Note | Augmenting Decompiler Output with Learned Variable Names and Types

Install one_gadget on Ubuntu 16.04 in 2023

Write Up | HITCON Training Lab - bamboobox

An Interesting C Function Definition