Privacy and encryption

As a protocol for peer-to-peer data storage and delivery, IPFS is a public network: nodes participating in the network store data associated with globally consistent content addresses (CIDs) and advertise, through publicly viewable distributed hash tables (DHTs), that they have those CIDs available for other nodes to use. This paradigm is one of IPFS's core strengths. At its most basic, it's essentially a globally distributed "server" of the network's total available data, referenceable both by the content itself (those CIDs) and by the participants (the nodes) who have or want the content.

What this means, however, is that IPFS itself doesn't explicitly protect knowledge about CIDs and the nodes that provide or retrieve them. This isn't unique to the distributed web; on both the d-web and the legacy web, traffic and other metadata can be monitored in ways that reveal a lot about a network and its users. Some key details are outlined below, but in short: while IPFS traffic between nodes is encrypted, the metadata those nodes publish to the DHT is public. Nodes announce a variety of information essential to the DHT's function, including their unique node identifiers (PeerIDs) and the CIDs of data that they're providing. Because of this, information about which nodes are retrieving and/or reproviding which CIDs is publicly available.

So, why doesn't the IPFS protocol itself have a privacy layer built in? This is in line with key principles of the protocol's highly modular design: different uses of IPFS over its lifetime may call for different approaches to privacy. Implementing one approach to privacy within the IPFS core could "box in" future builders by sacrificing modularity, flexibility, and future-proofing. On the other hand, freeing those building on IPFS to use the best privacy approach for the situation at hand ensures IPFS is useful to as many people as possible.

If you're worried about the implications of this, it might be worth taking additional measures such as disabling reproviding, encrypting sensitive content, or even running a private IPFS network if that's appropriate for you.

TIP

While IPFS traffic between nodes is encrypted, the essential metadata that nodes publish to the DHT — including their unique node identifiers (PeerIDs) and the CIDs of data that they're providing — is public. If you're worried about the implications of this for your personal use case, it's worth taking additional measures.

What's public on IPFS

Everything published to IPFS is public, including the contents of files themselves, unless the content is encrypted. For purposes of understanding IPFS privacy, this may be easiest to think about in two halves: content identifiers (CIDs) and IPFS nodes themselves.

Content identifiers

Because IPFS uses content addressing rather than the legacy web's method of location addressing, each piece of data stored in the IPFS network gets its own unique content identifier (CID). Copies of the data associated with that CID can be stored in any number of locations worldwide on any number of participating IPFS nodes. To make retrieving the data associated with a particular CID efficient and robust, IPFS uses a distributed hash table (DHT) to keep track of what's stored where. When you use IPFS to retrieve a particular CID, your node queries the DHT to find the closest nodes to you with that item — and by default also agrees to re-provide that CID to other nodes for a limited time until periodic "garbage collection" clears your cache of content you haven't used in a while. You can also "pin" CIDs that you want to make sure are never garbage-collected — either explicitly using IPFS's low-level pin API or implicitly using the Mutable File System (MFS) — which also means you're acting as a permanent reprovider of that data.
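
The flow described above can be sketched with a toy model. This is only an illustration under loud assumptions, not how IPFS is implemented: a bare SHA-256 hex digest stands in for a real CID (which has its own encoding), and a single in-process dictionary stands in for the distributed hash table. The names `toy_cid`, `ToyNetwork`, `add`, and `retrieve` are hypothetical.

```python
import hashlib
from collections import defaultdict


def toy_cid(data: bytes) -> str:
    """Derive an identifier from the content itself (stand-in for a real CID)."""
    return hashlib.sha256(data).hexdigest()


class ToyNetwork:
    """Single-process stand-in for the DHT's public provider records."""

    def __init__(self) -> None:
        self.providers = defaultdict(set)  # identifier -> peer IDs announcing it
        self.blocks = {}                   # identifier -> content bytes

    def add(self, peer_id: str, data: bytes) -> str:
        """Store content and publicly announce peer_id as a provider."""
        cid = toy_cid(data)
        self.blocks[cid] = data
        self.providers[cid].add(peer_id)
        return cid

    def retrieve(self, peer_id: str, cid: str) -> bytes:
        """Fetch content by identifier; the retriever becomes a provider too."""
        if cid not in self.blocks:
            raise KeyError(f"no provider known for {cid}")
        data = self.blocks[cid]
        # Content addressing lets the retriever verify what it received.
        if toy_cid(data) != cid:
            raise ValueError("content does not match its identifier")
        # Mirrors the default behaviour described above: retrieving a CID
        # also means re-providing it, which is visible to everyone.
        self.providers[cid].add(peer_id)
        return data


net = ToyNetwork()
cid = net.add("peer-A", b"hello ipfs")
net.retrieve("peer-B", cid)
print(sorted(net.providers[cid]))  # ['peer-A', 'peer-B']
```

The point of the sketch is the last line: after a single retrieval, the public provider record links both peers to the same content identifier.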

This is one of the advantages of IPFS over traditional legacy web hosting. It means retrieving files — especially popular ones that exist on lots of nodes in the network — can be faster and more bandwidth-efficient. However, it's important to note that those DHT queries happen in public. Because of this, it's possible that third parties could be monitoring this traffic to determine what CIDs are being requested, when, and by whom. As IPFS continues to grow in popularity, it's more likely that such monitoring will exist.
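
To make the monitoring concern concrete, here is a minimal sketch of what a passive third party could derive from lookups it observes. Everything in it is hypothetical: `QueryObserver`, its methods, and the peer and identifier strings are illustrative placeholders, and no real IPFS or DHT API is used.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class QueryObserver:
    """Hypothetical passive monitor: records (time, peer, identifier) per lookup."""

    seen: List[Tuple[int, str, str]] = field(default_factory=list)

    def observe(self, timestamp: int, peer_id: str, cid: str) -> None:
        """Record one publicly visible lookup."""
        self.seen.append((timestamp, peer_id, cid))

    def requested_by(self, peer_id: str) -> List[str]:
        """Identifiers a given peer has asked for, ordered by first sighting."""
        result: List[str] = []
        for _, peer, cid in sorted(self.seen):
            if peer == peer_id and cid not in result:
                result.append(cid)
        return result

    def requesters_of(self, cid: str) -> List[str]:
        """Peers that have asked for a given identifier, sorted by peer ID."""
        return sorted({peer for _, peer, c in self.seen if c == cid})


observer = QueryObserver()
observer.observe(100, "peer-B", "cid-1")
observer.observe(105, "peer-C", "cid-1")
observer.observe(230, "peer-B", "cid-2")

print(observer.requested_by("peer-B"))  # ['cid-1', 'cid-2']
print(observer.requesters_of("cid-1"))  # ['peer-B', 'peer-C']
```

Even this trivial log answers the three questions raised above: what was requested, when, and by whom.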
